Our humanoid avatar, Kilian, has literally gained the ability to see. Thanks to specialized multimodal vision models, he can now not only recognize objects but also capture printed text and newspaper articles in real time via the camera. The system analyzes the text within seconds, summarizes the key points, and is then ready for an in-depth discussion of the content. This enhancement was developed as a showcase for the Hessian Ministry of Digitalization and Innovation.
In the latest Weekly Talk, "Kilian Reads Along: From Document to AI Conversation", we take an in-depth look at the complex system architecture behind the project.
If you'd like to try out the avatar yourself and hold a document up to the camera, you can find the live showcase at INOSOFT.de/kilian.
June 26, 2026