about intelligence
data + code is already intelligence
but that’s digital intelligence
now multimodal llms bring recognition intelligence and instruction-following intelligence
1) recognition means it can understand text, images and video. it can extract the key information from raw input, so it can now digitize the world.
2) instruction following means it understands the text prompt and translates it into detailed and accurate execution steps.
so combining llm + tools (code execution, function calls, etc.)
it can generate accurate code from ambiguous, high-level natural language
it can generate intelligence on the fly.
and then it can run the code
so today it’s already intelligent.
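a minimal sketch of the generate-then-run idea above. everything here is an assumption for illustration: `stub_llm` is a hard-coded stand-in for a real model call, `generate_and_run` and the `solve` entry point are made-up names, and `exec` is used as the simplest possible "code execution tool" with no sandboxing.

```python
from typing import Callable


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: returns Python source for the request."""
    # a real model would generate this text; here it is hard-coded for illustration
    if "average" in prompt:
        return "def solve(xs):\n    return sum(xs) / len(xs)\n"
    raise ValueError("the stub only knows one request")


def generate_and_run(
    prompt: str,
    data: list[float],
    llm: Callable[[str], str] = stub_llm,
) -> float:
    """Turn a natural-language request into code, then execute that code on data."""
    source = llm(prompt)        # step 1: natural language -> code
    namespace: dict = {}
    exec(source, namespace)     # step 2: the code execution tool (unsandboxed here)
    return namespace["solve"](data)


result = generate_and_run("give me the average of these numbers", [2.0, 4.0, 6.0])
print(result)  # 4.0
```

in a real setup the generated code would run in an isolated environment, not in the caller's own process.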
the main gaps today are self-learning capability and sophistication
self-learning means it can be like a student, who knows where the information and knowledge is, proactively reads it to gain understanding and knowledge, and through exercises gains experience.
sophistication means it can quickly realize it is on the wrong path and correct itself, and can perform sophisticated tasks end to end.
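one way to picture "realize the wrong path and correct itself" is a check-and-retry loop. this is only a sketch under assumptions: `stub_llm_with_feedback` is a hypothetical stand-in whose first answer is deliberately wrong, `check` is a made-up verifier with a single known test case, and real systems would need much richer feedback than this.

```python
from typing import Optional


def stub_llm_with_feedback(task: str, feedback: Optional[str]) -> str:
    """Hypothetical model: the first attempt is wrong, the retry uses the feedback."""
    if feedback is None:
        return "def solve(xs):\n    return sum(xs)\n"         # wrong path: forgets to divide
    return "def solve(xs):\n    return sum(xs) / len(xs)\n"   # corrected attempt


def check(source: str) -> Optional[str]:
    """Run the candidate on a known case; return None on success, else an error note."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        got = namespace["solve"]([2.0, 4.0, 6.0])
    except Exception as exc:
        return f"raised {exc!r}"
    return None if got == 4.0 else f"expected 4.0, got {got}"


def solve_with_retries(task: str, max_attempts: int = 3) -> tuple[str, int]:
    """Generate, verify, and feed the failure back until a candidate passes."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        source = stub_llm_with_feedback(task, feedback)
        feedback = check(source)
        if feedback is None:
            return source, attempt
    raise RuntimeError(f"no passing candidate after {max_attempts} attempts: {feedback}")


final_source, attempts_used = solve_with_retries("average a list of numbers")
print(attempts_used)  # 2
```

the loop only catches mistakes the check can see, which is part of why this is still a gap.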