The models can now be used without this repository via the following code, which can also be set up in other repositories. This path has not been widely tested, so we welcome any bug reports / ...
A simple, yet effective, cross-modality framework built atop frozen LLMs that allows the integration of various modalities (image, video, audio, 3D) without extensive modality-specific customization.
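
To make the idea above concrete, here is a minimal, library-free sketch of the general pattern: one small trainable adapter per modality projects that modality's features into the embedding space of a language model that is never updated. This is an illustration under stated assumptions, not this project's actual architecture or API. Every name in it (`LinearAdapter`, `FrozenLLMStub`, `build_prefix`, `FEATURE_DIMS`, `LLM_DIM`) and every dimension is hypothetical, and a plain linear projection stands in for whatever modality encoder and adapter the real models use.

```python
import random

LLM_DIM = 8  # hypothetical embedding width of the frozen LLM

# Hypothetical per-modality feature widths (output of some modality encoder).
FEATURE_DIMS = {"image": 16, "video": 16, "audio": 12, "3d": 10}


class LinearAdapter:
    """Trainable projection from one modality's feature space to LLM space."""

    def __init__(self, in_dim, out_dim, seed):
        rng = random.Random(seed)
        # out_dim rows of in_dim weights each; these are the only trainable parts.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                        for _ in range(out_dim)]

    def __call__(self, features):
        # features: list of token vectors, each of length in_dim.
        return [[sum(w * x for w, x in zip(row, vec)) for row in self.weights]
                for vec in features]


class FrozenLLMStub:
    """Stand-in for a frozen LLM: it only consumes embeddings of a fixed width."""

    def __init__(self, dim):
        self.dim = dim

    def consume(self, embeddings):
        if any(len(vec) != self.dim for vec in embeddings):
            raise ValueError("embedding width does not match the LLM")
        return len(embeddings)  # number of prefix tokens accepted


# The same adapter class is reused for every modality; only in_dim differs.
ADAPTERS = {name: LinearAdapter(dim, LLM_DIM, seed=i)
            for i, (name, dim) in enumerate(FEATURE_DIMS.items())}


def build_prefix(inputs):
    """Project each modality's features and concatenate them into one prefix."""
    prefix = []
    for modality, features in inputs.items():
        prefix.extend(ADAPTERS[modality](features))
    return prefix
```

Because the adapter class is shared and the LLM stub only checks the embedding width, supporting another modality in this sketch means adding one entry to `FEATURE_DIMS` rather than writing modality-specific code.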