HunyuanWorld-Voyager is an interactive RGB-D video generation model for building immersive 3D scenes. Given a single image and a user-defined camera path, it generates world-consistent 3D point-cloud sequences, which makes it suited to exploring and reconstructing 3D environments.
Benefits
HunyuanWorld-Voyager offers several key advantages:
- World-Consistent Video Diffusion: The model generates aligned RGB and depth video sequences, ensuring global coherence and consistency.
- Long-Range World Exploration: With an efficient world cache and auto-regressive inference, Voyager allows for iterative scene extension with context-aware consistency.
- High Performance: In the project's reported comparisons, Voyager outperforms other methods on a range of metrics, including WorldScore Average, Camera Control, Object Control, Content Alignment, 3D Consistency, Photometric Consistency, Style Consistency, and Subjective Quality.
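The internals of Voyager's world cache are not described here, so the following is only a toy illustration of the general idea behind iterative scene extension: points from successive clips are accumulated, and near-duplicates are dropped. It uses plain voxel-grid deduplication, which is a swapped-in generic technique rather than the project's method. The class name `PointCache` and the `voxel_size` parameter are hypothetical.

```python
import math
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]


class PointCache:
    """Toy accumulator that keeps at most one point per voxel (hypothetical)."""

    def __init__(self, voxel_size: float) -> None:
        if voxel_size <= 0:
            raise ValueError("voxel_size must be positive")
        self.voxel_size = voxel_size
        self._voxels: Dict[VoxelKey, Point] = {}

    def _key(self, p: Point) -> VoxelKey:
        # Integer voxel coordinates of the cell that contains the point.
        return tuple(math.floor(c / self.voxel_size) for c in p)

    def add(self, points: Iterable[Point]) -> int:
        """Insert points; return how many landed in previously empty voxels."""
        added = 0
        for p in points:
            key = self._key(p)
            if key not in self._voxels:
                self._voxels[key] = p
                added += 1
        return added

    def points(self) -> List[Point]:
        return list(self._voxels.values())


cache = PointCache(voxel_size=0.5)
# Two points from a first clip that fall into the same voxel: one is kept.
first = cache.add([(0.1, 0.1, 1.0), (0.2, 0.2, 1.1)])
# A second clip revisits one location and reveals one new region.
second = cache.add([(0.1, 0.1, 1.0), (2.0, 0.0, 1.0)])
```

Keeping one point per voxel bounds memory as the scene grows, at the cost of discarding finer detail inside each cell.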
Use Cases
HunyuanWorld-Voyager can be used in a variety of applications, including:
- 3D Reconstruction: Generate aligned depth and RGB video for efficient 3D reconstruction.
- World Exploration: Create 3D-consistent scene videos for exploring virtual environments.
- Data Generation: Use the data engine to generate scalable data for RGB-D video training.
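To make the 3D reconstruction use case concrete, an aligned RGB frame and depth map can be lifted to a colored point cloud by pinhole back-projection. The sketch below is not taken from the Voyager codebase. The function name `backproject` and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative assumptions, and depth is assumed to be distance along the camera's z-axis.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Color = Tuple[int, int, int]


def backproject(
    depth: List[List[float]],
    rgb: List[List[Color]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Tuple[Point, Color]]:
    """Lift an aligned RGB-D frame to colored 3D points in camera coordinates.

    Assumes a pinhole camera. Pixels with non-positive depth are treated
    as invalid and skipped.
    """
    points = []
    for v, row in enumerate(depth):      # v: pixel row (image y)
        for u, z in enumerate(row):      # u: pixel column (image x)
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append(((x, y, z), rgb[v][u]))
    return points


# Tiny 2x2 frame with one invalid depth value (bottom-left pixel).
depth = [[2.0, 2.0], [0.0, 4.0]]
rgb = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
cloud = backproject(depth, rgb, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Because RGB and depth are pixel-aligned, each 3D point takes its color from the same `(v, u)` index, with no separate registration step.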
Requirements
To run HunyuanWorld-Voyager, you will need:
- Model: HunyuanWorld-Voyager
- Resolution: 540p
- GPU Peak Memory: 60GB (minimum), 80GB (recommended)
- GPU: NVIDIA GPU with CUDA support
- Operating System: Linux
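A quick way to compare a machine against the memory requirement is to read each GPU's total memory from `nvidia-smi`. The sketch below is a minimal check under two assumptions: "60GB" is read as 60 × 1024 MiB, and total memory is used as a stand-in for peak usage, so passing is necessary but not sufficient. The helper names are hypothetical.

```python
import subprocess
from typing import List

MIN_GPU_MEMORY_MIB = 60 * 1024  # assumption: "60GB" interpreted as 60 GiB


def parse_memory_totals(nvidia_smi_output: str) -> List[int]:
    """Parse one total-memory value (MiB) per non-empty line of CSV output."""
    return [
        int(line.strip())
        for line in nvidia_smi_output.splitlines()
        if line.strip()
    ]


def gpus_meeting_minimum(
    totals_mib: List[int], minimum_mib: int = MIN_GPU_MEMORY_MIB
) -> List[int]:
    """Return the indices of GPUs whose total memory is at least the minimum."""
    return [i for i, total in enumerate(totals_mib) if total >= minimum_mib]


def query_gpu_memory() -> List[int]:
    """Ask nvidia-smi for total memory per GPU; requires the NVIDIA driver."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_totals(result.stdout)
```

On a machine with the driver installed, `gpus_meeting_minimum(query_gpu_memory())` lists the GPU indices that clear the threshold.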
Installation
To install HunyuanWorld-Voyager, follow these steps:
- Clone the repository:
    git clone https://github.com/Tencent-Hunyuan/HunyuanWorld-Voyager
    cd HunyuanWorld-Voyager
- Create and activate a conda environment:
    conda create -n voyager python==3.11.9
    conda activate voyager
- Install PyTorch and other dependencies:
    conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.4 -c pytorch -c nvidia
- Install pip dependencies:
    python -m pip install -r requirements.txt
    python -m pip install transformers==4.39.3
- Install flash attention v2 for acceleration:
    python -m pip install flash-attn
- Install xDiT for parallel inference:
    python -m pip install xfuser==0.4.2
Inference
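Before running inference, it can help to confirm that the installed packages match the versions pinned in the installation steps. The following is a minimal sketch using the standard library's `importlib.metadata`. The helper names are hypothetical, and it assumes each package exposes distribution metadata under the name shown (conda-installed builds may differ).

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Versions copied from the installation steps.
PINNED: Dict[str, str] = {
    "torch": "2.4.0",
    "torchvision": "0.19.0",
    "torchaudio": "2.4.0",
    "transformers": "4.39.3",
    "xfuser": "0.4.2",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Optional[str]]:
    """Map each package that is missing or off-pin to the version found."""
    mismatches: Dict[str, Optional[str]] = {}
    for package, wanted in pins.items():
        found = lookup(package)
        # Local build tags such as "+cu124" are ignored for the comparison.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = found
    return mismatches


# Example with a stand-in environment instead of the real one.
fake_env = {"torch": "2.4.0+cu124", "transformers": "4.40.0"}
example = find_mismatches(
    {"torch": "2.4.0", "transformers": "4.39.3", "xfuser": "0.4.2"},
    fake_env.get,
)
```

Calling `find_mismatches(PINNED)` inside the activated `voyager` environment checks the real installation. An empty result means every pin matched.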
To perform single-GPU inference, use the following command:
    cd HunyuanWorld-Voyager
    python3 sample_image2video.py \
        --model HYVideo-T/2 \
        --input-path "examples/case1" \
        --prompt "An old-fashioned European village with thatched roofs on the houses." \
        --i2v-stability \
        --infer-steps 50 \
        --flow-reverse \
        --flow-shift 7.0 \
        --seed 0 \
        --embedded-cfg-scale 6.0 \
        --use-cpu-offload \
        --save-path ./results
To generate a video with 8 GPUs, use the following command:
    cd HunyuanWorld-Voyager
    ALLOW_RESIZE_FOR_SP=1 torchrun --nproc_per_node=8 \
        sample_image2video.py \
        --model HYVideo-T/2 \
        --input-path "examples/case1" \
        --prompt "An old-fashioned European village with thatched roofs on the houses." \
        --i2v-stability \
        --infer-steps 50 \
        --flow-reverse \
        --flow-shift 7.0 \
        --seed 0 \
        --embedded-cfg-scale 6.0 \
        --save-path ./results \
        --ulysses-degree 8 \
        --ring-degree 1
To run the Gradio demo, use the following command:
    cd HunyuanWorld-Voyager
    python3 app.py
Additional Information
HunyuanWorld-Voyager is an open-source project with contributions from various researchers and collaborators. If you find Voyager useful for your research or applications, please cite it using the BibTeX entry provided in the repository.
For more details, visit the HunyuanWorld-Voyager GitHub repository.