pIXELsHAM.com – Page 18

A.I., software

Fal Video Studio – The first open-source AI toolkit for video editing

pIXELsHAM.com

Jan 25, 2025

https://github.com/fal-ai-community/video-starter-kit

https://fal-video-studio.vercel.app

🎬 Browser-Native Video Processing: Seamless video handling and composition in the browser
🤖 AI Model Integration: Direct access to state-of-the-art video models through fal.ai
- Minimax for video generation
- Hunyuan for visual synthesis
- LTX for video manipulation
🎵 Advanced Media Capabilities:
- Multi-clip video composition
- Audio track integration
- Voiceover support
- Extended video duration handling
🛠️ Developer Utilities:
- Metadata encoding
- Video processing pipeline
- Ready-to-use UI components
- TypeScript support

A.I.

Tencent Hunyuan3D – an advanced large-scale 3D synthesis system for generating high-resolution textured 3D assets

pIXELsHAM.com

Jan 25, 2025

https://github.com/tencent/Hunyuan3D-2

Hunyuan3D 2.0, an advanced large-scale 3D synthesis system for generating high-resolution textured 3D assets. This system includes two foundation components: a large-scale shape generation model – Hunyuan3D-DiT, and a large-scale texture synthesis model – Hunyuan3D-Paint.

The shape generative model, built on a scalable flow-based diffusion transformer, aims to create geometry that properly aligns with a given condition image, laying a solid foundation for downstream applications. The texture synthesis model, benefiting from strong geometric and diffusion priors, produces high-resolution and vibrant texture maps for either generated or hand-crafted meshes. Furthermore, we build Hunyuan3D-Studio – a versatile, user-friendly production platform that simplifies the re-creation process of 3D assets.

It allows both professional and amateur users to manipulate or even animate their meshes efficiently. We systematically evaluate our models, showing that Hunyuan3D 2.0 outperforms previous state-of-the-art models, including the open-source models and closed-source models in geometry details, condition alignment, texture quality, and e.t.c.

blender

Florent Poux – Create 3D Point Cloud Renderings with Blender

pIXELsHAM.com

Jan 25, 2025

https://towardsdatascience.com/the-blender-handbook-for-3d-point-cloud-visualization-and-rendering-1700ebe69c7b

hardware, photography

SLAM XCAM 8K VR180 3D Camera

pIXELsHAM.com

Jan 25, 2025

https://www.kickstarter.com/projects/vr1803dcamera/slam-vr180-3d-ai-camera-smarter-smoother-sharper?ref=axcdoc

Watch this video on YouTube

blender

Physical Open Waters Used To Create The Water Scenes In ‘Flow’ Is Now Available Publicly

pIXELsHAM.com

Jan 25, 2025

https://www.cartoonbrew.com/tools/the-custom-blender-plug-in-that-was-used-to-create-the-water-scenes-in-flow-is-now-available-publicly-245166.html

https://blendermarket.com/products/physical-open-waters

Watch this video on YouTube

A.I., software

Invoke.com – The Gen AI Platform for Pro Studios

pIXELsHAM.com

Jan 25, 2025

https://www.invoke.com

Invoke is a powerful, secure, and easy-to-deploy generative AI platform for professional studios to create visual media. Train models on your intellectual property, control every aspect of the production process, and maintain complete ownership of your data, in perpetuity.

Watch this video on YouTube

Watch this video on YouTube

Watch this video on YouTube

Watch this video on YouTube

A.I., Featured

How does Stable Diffusion work?

pIXELsHAM.com

Jan 24, 2025

https://stable-diffusion-art.com/how-stable-diffusion-work/

Stable Diffusion is a latent diffusion model that generates AI images from text. Instead of operating in the high-dimensional image space, it first compresses the image into the latent space.

Stable Diffusion belongs to a class of deep learning models called diffusion models. They are generative models, meaning they are designed to generate new data similar to what they have seen in training. In the case of Stable Diffusion, the data are images.

Why is it called the diffusion model? Because its math looks very much like diffusion in physics. Let’s go through the idea.

(more…)

3Dprinting

Elegoo’s Best Resin 3D Printer – Saturn 4 Ultra 16K

pIXELsHAM.com

Jan 24, 2025

photogrammetry

GSTAR – Gaussian Surface Tracking and Reconstruction

pIXELsHAM.com

Jan 24, 2025

https://eth-ait.github.io/GSTAR

Watch this video on YouTube

A.I.

Meta DINOv2 – A Self-supervised Vision Transformer Model

pIXELsHAM.com

Jan 24, 2025

https://ai.meta.com/blog/dino-v2-computer-vision-self-supervised-learning

https://dinov2.metademolab.com

hardware, ves

Google buys part of HTC’s Vive VR team for $250 million

pIXELsHAM.com

Jan 24, 2025

https://www.engadget.com/big-tech/google-buys-part-of-htcs-vive-vr-team-for-250-million-130046567.html

3Dprinting, design

5 Amazing Ideas to Take Silicone Mold Making to a Whole New Level!

pIXELsHAM.com

Jan 23, 2025

A.I.

Hunyuan video-to-video re-styling

pIXELsHAM.com

Jan 20, 2025

The open-source community has figured out how to run Hunyuan V2V using LoRAs.

You’ll need to install Kijai’s ComfyUI-HunyuanLoom and LoRAs, which you can either train yourself or find on Civitai.

https://www.linkedin.com/posts/leokadieff_ai-generativeai-filmmaking-activity-7286521455448608769-abMg

1) you’ll need HunyuanLoom, after install, workflow found in the repo.
https://github.com/logtd/ComfyUI-HunyuanLoom

2) John Wick lora found here.
https://civitai.com/models/1131159/john-wick-hunyuan-video-lora

A.I., software

KlingAI – Kolors and Elements

pIXELsHAM.com

Jan 20, 2025

https://klingai.com

Watch this video on YouTube

Watch this video on YouTube

A.I., lighting

SynthLight – Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces

pIXELsHAM.com

Jan 20, 2025

https://vrroom.github.io/synthlight

A.I., modeling

Shapen – Pixels to polygons text-to-model

pIXELsHAM.com

Jan 20, 2025

https://shapen.com

A.I.

Seaweed APT – Diffusion Adversarial Post-Training for One-Step Video Generation

pIXELsHAM.com

Jan 20, 2025

https://seaweed-apt.com

https://cdn.seaweed-apt.com/assets/showreel/seaweed-apt.mp4

This demonstrate large-scale text-to-video generation with a single neural function evaluation (1NFE) by using our proposed adversarial post-training technique. Our model generates 2 seconds of 1280×720 24fps videos in real-time

composition, lighting

This ONE Step Makes CG Look Cinematic

pIXELsHAM.com

Jan 20, 2025

python

Pyper – a flexible framework for concurrent and parallel data-processing in Python

pIXELsHAM.com

Jan 18, 2025

Pyper is a flexible framework for concurrent and parallel data-processing, based on functional programming patterns.

https://github.com/pyper-dev/pyper

software, ves

Jacob Bartlett – Apple is Killing Swift

pIXELsHAM.com

Jan 18, 2025

https://blog.jacobstechtavern.com/p/apple-is-killing-swift

Jacob Bartlett argues that Swift, once envisioned as a simple and composable programming language by its creator Chris Lattner, has become overly complex due to Apple’s governance. Bartlett highlights that Swift now contains 217 reserved keywords, deviating from its original goal of simplicity. He contrasts Swift’s governance model, where Apple serves as the project lead and arbiter, with other languages like Python and Rust, which have more community-driven or balanced governance structures. Bartlett suggests that Apple’s control has led to Swift’s current state, moving away from Lattner’s initial vision.

photogrammetry

Don’t Splat your Gaussians – Volumetric Ray-Traced Primitives for Modeling and Rendering Scattering and Emissive Media

pIXELsHAM.com

Jan 18, 2025

https://arcanous98.github.io/projectPages/gaussianVolumes.html

We propose a compact and efficient alternative to existing volumetric representations for rendering such as voxel grids.

A.I.

IPAdapter – Text Compatible Image Prompt Adapter for Text-to-Image Image-to-Image Diffusion Models and ComfyUI implementation

pIXELsHAM.com

Jan 17, 2025

github.com/tencent-ailab/IP-Adapter

ip-adapter.github.io/

The IPAdapter are very powerful models for image-to-image conditioning. The subject or even just the style of the reference image(s) can be easily transferred to a generation. Think of it as a 1-image lora. They are an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models. An IP-Adapter with only 22M parameters can achieve comparable or even better performance to a fine-tuned image prompt model.

Once the IP-Adapter is trained, it can be directly reusable on custom models fine-tuned from the same base model.

The IP-Adapter is fully compatible with existing controllable tools, e.g., ControlNet and T2I-Adapter.

Watch this video on YouTube

A.I.

Sony – Diffusion Training from Scratch on a Micro-Budget

pIXELsHAM.com

Jan 17, 2025

stability.ai/news/stable-point-aware-3d

huggingface.co/VSehwag24/MicroDiT

A.I.

GAGA – Group Any Gaussians via 3D-aware Memory Bank segmentation

pIXELsHAM.com

Jan 17, 2025

www.gaga.gallery/

https://github.com/weijielyu/Gaga

Watch this video on YouTube

software

Curl – Client for URLs is a free and open source command line tool for transferring data using various protocols

pIXELsHAM.com

Jan 16, 2025

curl.se/

www.keycdn.com/support/popular-curl-examples

A.I.

ComfyUI – Zero to hero with Cubiq (Matteo)

pIXELsHAM.com

Jan 15, 2025

A.I.

Kinetix.tech – Character motion control

pIXELsHAM.com

Jan 15, 2025

www.kinetix.tech/

www.kinetix.tech/character-motion-control-for-video-generation-models

A.I., modeling, photogrammetry

SPAR3D – Stable Point-Aware Reconstruction of 3D Objects from Single Images

pIXELsHAM.com

Jan 15, 2025

SPAR3D is a fast single-image 3D reconstructor with intermediate point cloud generation, which allows for interactive user edits and achieves state-of-the-art performance.

https://github.com/Stability-AI/stable-point-aware-3d

https://static1.squarespace.com/static/6213c340453c3f502425776e/t/677e3bc1b9e5df16b60ed4fe/1736326093956/SPAR3D+Research+Paper.pdf

https://stability.ai/news/stable-point-aware-3d?utm_source=x&utm_medium=social&utm_campaign=SPAR3D

Watch this video on YouTube

A.I.

MiniMax-01 goes open source

pIXELsHAM.com

Jan 15, 2025

MiniMax is thrilled to announce the release of the MiniMax-01 series, featuring two groundbreaking models:

MiniMax-Text-01: A foundational language model.
MiniMax-VL-01: A visual multi-modal model.

Both models are now open-source, paving the way for innovation and accessibility in AI development!

🔑 Key Innovations
1. Lightning Attention Architecture: Combines 7/8 Lightning Attention with 1/8 Softmax Attention, delivering unparalleled performance.
2. Massive Scale with MoE (Mixture of Experts): 456B parameters with 32 experts and 45.9B activated parameters.
3. 4M-Token Context Window: Processes up to 4 million tokens, 20–32x the capacity of leading models, redefining what’s possible in long-context AI applications.

💡 Why MiniMax-01 Matters
1. Innovative Architecture for Top-Tier Performance
The MiniMax-01 series introduces the Lightning Attention mechanism, a bold alternative to traditional Transformer architectures, delivering unmatched efficiency and scalability.

2. 4M Ultra-Long Context: Ushering in the AI Agent Era
With the ability to handle 4 million tokens, MiniMax-01 is designed to lead the next wave of agent-based applications, where extended context handling and sustained memory are critical.

3. Unbeatable Cost-Effectiveness
Through proprietary architectural innovations and infrastructure optimization, we’re offering the most competitive pricing in the industry:
$0.2 per million input tokens
$1.1 per million output tokens

🌟 Experience the Future of AI Today
We believe MiniMax-01 is poised to transform AI applications across industries. Whether you’re building next-gen AI agents, tackling ultra-long context tasks, or exploring new frontiers in AI, MiniMax-01 is here to empower your vision.

✅ Try it now for free: hailuo.ai

📄 Read the technical paper: filecdn.minimax.chat/_Arxiv_MiniMax_01_Report.pdf

🌐 Learn more: minimaxi.com/en/news/minimax-01-series-2

💡API Platform: intl.minimaxi.com/