Modeling “knowledge”. Doesn’t make much sense. AI is so much more than that. And the knowledge, the rules themselves, have only limited use. If we’re talking about organizing information, sure, maybe the text itself is good enough. But when we’re learning we care much more about developing a model of the thing than about remembering a set of rules.
Because formalized knowledge can only really be understood with a world model. And a world model is *necessary* to tie pieces of knowledge together: unless there are formalized rules that can act as a bridge between the pieces, there’s no formal way to connect them. And most of the time there is no formal bridge, so the connection has to be inferred from knowing where each piece of knowledge sits in the world model.
I’m convinced at this point that AI is less about extracting knowledge and rules and manipulating them, and more about building a model of something. An internal representation of a thing, where we can put inputs into the model and get outputs. A probabilistic model, that’s all. So yeah, ChatGPT is it. GPT. Transformers. Whatever. Whatever model gets the lowest test perplexity. These models all learn the same features anyways.
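To make “lowest test perplexity” concrete: perplexity is just the exponential of the average per-token negative log-likelihood, so a model that spreads its probability evenly over k choices has perplexity k. A minimal sketch:

```python
import math

def perplexity(token_log_probs):
    # exp of the average negative log-likelihood per token
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that assigns each of 4 candidate tokens probability 0.25
# has log-prob log(0.25) per token, giving a perplexity of exactly 4.
print(perplexity([math.log(0.25)] * 4))  # 4.0
```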
The only problem with the current crop of transformers is their inability to make generalizations. Unless, well, they do that under the hood. They likely do, since generalizations are an efficient representation of the data. *But* they may not have the *most* efficient representation, especially when it comes to video models, because a model should be able to reuse the same representation for the same object seen from a different angle. What the heck, maybe modern video models do that. I’m not convinced. That’s what tape is trying to do: be more efficient.
So, I know now that the approach I had to zilwiki was incomplete. Not because it was wrong, but because the act of organizing information alone is incomplete without a world model. Organizing is too hard without one, and the user would still want to query the world model anyway. “What would happen if I were to do X?” Without any pre-existing data on that exact scenario, answering requires a world model to infer the answer.
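A toy way to see the gap (the facts and the rule here are made up purely for illustration): a store of recorded observations can’t answer a query it never saw, while even a crude model can interpolate.

```python
# Recorded observations: only the exact scenarios we have data for.
facts = {("water", 100): "boils", ("water", 0): "freezes"}

def lookup(substance, temp_c):
    return facts.get((substance, temp_c))  # None for anything unseen

def model(substance, temp_c):
    # A deliberately crude "world model" generalized from the facts.
    if substance == "water":
        if temp_c >= 100:
            return "boils"
        return "liquid" if temp_c > 0 else "freezes"
    return "unknown"

print(lookup("water", 40))  # None: no pre-existing data on this exact scenario
print(model("water", 40))   # "liquid": inferred from the model
```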
Similarly, the practice of software engineering is that each SWE has a model of the system in their head: a sense of how it works, how to change it, and how it behaves in different scenarios. This area, unlike general information, is ripe for formalization, because the code itself is already formal. Code should be able to reason about code. It’s just code. It’s just logic. We can apply logic to logic.
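A tiny concrete instance of code reasoning about code: Python’s standard `ast` module can statically answer questions like “which functions does each function call” without running anything (the module being analyzed is made up):

```python
import ast

source = '''
def load(path):
    return open(path).read()

def main():
    print(load("data.txt"))
'''

# Parse the source into a syntax tree and, for each function definition,
# list the plain names it calls: logic applied to logic.
tree = ast.parse(source)
for fn in ast.walk(tree):
    if isinstance(fn, ast.FunctionDef):
        calls = [c.func.id for c in ast.walk(fn)
                 if isinstance(c, ast.Call) and isinstance(c.func, ast.Name)]
        print(fn.name, "calls", calls)
# load calls ['open']
# main calls ['print', 'load']
```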
A good chunk of software engineering is tooling: working with tools, making sure they don’t break, and handling all the edge cases that come up with them. Another part is the business logic: what a thing is, how it works, how we want it to work. Ideally, all of software would be the latter, and we’d work closely with business and product people to figure out what the business logic is. The software engineer essentially becomes the “formalizer”, turning informal logic and specifications into formal logic and specifications. Similarly, data science is almost entirely specification: the logic of how to handle the data. Ideally the code would express that logic directly, so nothing more is needed. The more we can formalize in code, the less the software engineer needs to model in their head. Ideally, specifications would be formalized and the compiler would check that they’re consistent, and maybe some desired properties would be formalized and the compiler would check that they hold. Unclear exactly what the final system would look like.
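Property-based testing is one existing approximation of “formalize the desired properties and check that they hold”; a test runner does the checking rather than a compiler, but the shape is the same. A sketch using the Hypothesis library, where `apply_discount` and its invariants are hypothetical business logic, not anything from a real system:

```python
from hypothesis import given, strategies as st

def apply_discount(price_cents: int, percent: int) -> int:
    # Hypothetical business rule: reduce the price by a whole percentage.
    return price_cents - (price_cents * percent) // 100

# Formalized properties: a discount never makes a price negative and
# never increases it. Hypothesis searches for counterexamples.
@given(st.integers(min_value=0, max_value=10**9),
       st.integers(min_value=0, max_value=100))
def test_discount_properties(price_cents, percent):
    result = apply_discount(price_cents, percent)
    assert 0 <= result <= price_cents
```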
So moving forward, I’ll work on tape until I know whether it works. If it doesn’t, I’ll move on to software and not look back, accepting that gradient descent is the best way to learn models and leaving it to smarter people to come up with better ideas, because the only idea I could come up with didn’t work. If it does work, I’ll worry about that then, because hey, that would be great.