Vistring Technologies

Where Video Creation Meets Technology: GAI at Vistring

Core Technologies

HiveNet is a cutting-edge large visual model that aims to transform video generation, much as ChatGPT has transformed text generation with large language models.

Face Mocap (Motion Capture)

HiveNet captures facial dynamics and transfers them to 3D characters, making high-fidelity facial animation widely accessible and expanding what is possible in VR, film, interactive avatars, and beyond.

HiveNet ranks top in the NoW benchmark for monocular 3D head modeling

Real-time, High-resolution 4D Free-view (3D + Time) Video

HiveNet renders high-resolution video in real time from any viewpoint, given only a sparse set of input cameras: for example, 360-degree video from just 24 cameras. It is a cost-effective solution that can increase production flexibility, open up new ways of filming, and make 3D conferencing more engaging.

DW-Pose 2D Human Pose Estimation

HiveNet locates human keypoints in real time from an image or video, capturing detailed movements of the hands and face as well as the body. This is a significant extension to ControlNet, going beyond the body to include intricate hand and face movements. It addresses common pose estimation challenges and improves motion analysis and animation in applications such as virtual try-on.

HiveNet ranks top on COCO-WholeBody

Read More: https://github.com/IDEA-Research/DWPose
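For readers who want to work with whole-body keypoints in their own code, the sketch below shows one way to split a pose into body, feet, face, and hand groups. It relies on the COCO-WholeBody convention of 133 keypoints per person (17 body, 6 foot, 68 face, 21 per hand). The input format, a list of (x, y, confidence) tuples, and the helper names `split_wholebody` and `visible` are assumptions made for illustration; they are not HiveNet's or DWPose's actual API.

```python
from typing import Dict, List, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, confidence); assumed input format

# COCO-WholeBody ordering: 17 body + 6 foot + 68 face + 21 left hand + 21 right hand = 133
PART_SLICES: Dict[str, slice] = {
    "body": slice(0, 17),
    "feet": slice(17, 23),
    "face": slice(23, 91),
    "left_hand": slice(91, 112),
    "right_hand": slice(112, 133),
}


def split_wholebody(keypoints: List[Keypoint]) -> Dict[str, List[Keypoint]]:
    """Group one person's 133 keypoints by body part (hypothetical helper)."""
    if len(keypoints) != 133:
        raise ValueError(f"expected 133 keypoints, got {len(keypoints)}")
    return {name: keypoints[part] for name, part in PART_SLICES.items()}


def visible(points: List[Keypoint], threshold: float = 0.3) -> List[Keypoint]:
    """Keep only keypoints whose confidence reaches the threshold (assumed value)."""
    return [p for p in points if p[2] >= threshold]


if __name__ == "__main__":
    # Dummy pose: every keypoint at (i, i) with full confidence.
    pose = [(float(i), float(i), 1.0) for i in range(133)]
    parts = split_wholebody(pose)
    for name, pts in parts.items():
        print(name, len(pts))
```

Grouping by part makes it straightforward to, for instance, drive a hand rig from `left_hand` and `right_hand` while ignoring low-confidence face points.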

Speech-driven Holistic 3D Expression & Gesture Generation

HiveNet generates harmonious, consistent human motion sequences from speech. It captures a variety of movement types, including slow motions, jittering motions, and realistic, agile motions, reaching a level of realism that surpasses the acting ability of most ordinary people. It is poised to transform VR, filmmaking, and digital education by making high-fidelity animation accessible and cost-effective.

*Please note the video has sound; adjust your volume before you hit play.

AI Teleprompter

Millions of video creators struggle to memorize scripts while trying to appear natural on camera. HiveNet uses advanced speech recognition and natural language processing to keep the on-screen script synchronized with the speaker's pace, so the speaker can deliver naturally and maintain eye contact.
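To make the idea of pace synchronization concrete, here is a minimal sketch of a script follower: it takes text that a speech recognizer has already produced and advances a cursor through the script, tolerating filler words and skipped words. It is an illustration under stated assumptions, not Vistring's implementation: the `ScriptFollower` class and the `lookahead` parameter are hypothetical, and the recognized text is assumed to come from any speech-to-text system.

```python
import re
from typing import List


def normalize(word: str) -> str:
    """Lowercase a word and strip punctuation so 'Channel.' matches 'channel'."""
    return re.sub(r"[^\w]", "", word).lower()


class ScriptFollower:
    """Tracks how far a speaker has progressed through a script (hypothetical helper)."""

    def __init__(self, script: str, lookahead: int = 5) -> None:
        self.words: List[str] = [normalize(w) for w in script.split()]
        self.lookahead = lookahead  # how many upcoming words to search (assumed value)
        self.position = 0           # index of the next script word we expect to hear

    def hear(self, recognized: str) -> int:
        """Consume recognized text and return the updated script position."""
        for raw in recognized.split():
            token = normalize(raw)
            if not token:
                continue
            window = self.words[self.position:self.position + self.lookahead]
            if token in window:
                # Jump past the matched word; words skipped by the speaker are skipped too.
                self.position += window.index(token) + 1
            # Words not found in the window (fillers, ad-libs) leave the cursor unchanged.
        return self.position


if __name__ == "__main__":
    follower = ScriptFollower("Hello everyone, welcome to my channel. Today we talk about video.")
    print(follower.hear("hello everyone"))         # cursor after "everyone,"
    print(follower.hear("um welcome my channel"))  # filler ignored, skipped "to" tolerated
```

A display layer could scroll so that the word at `position` stays near the camera lens; the small lookahead window keeps a repeated common word from dragging the cursor far ahead.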


Text to Video

HiveNet stands out for generating human portraits that are more natural and vivid than those of many other solutions, such as Runway Gen2. It can generate short clips of approximately 3-5 seconds from a single sentence, and it can also turn entire text scripts into long-form talking videos. This significantly lowers the barrier to video production while improving the efficiency, creativity, and originality of content creation.

Prompt: A girl wearing a red dress is reading a book under the tree in the park.

*All elements in the scene are generated, including the figures, voice, background, props, as well as the flowing hair strands and shifting lights and shadows.
