import numpy as np
from PIL import Image, ImageDraw
import cv2
import os
import tempfile

def create_demo_mask_from_image(image_shape, center_x=0.5, center_y=0.5, radius=0.3):
    """Create a circular mask in the center of the image"""
    height, width = image_shape[:2]
    center_x = int(width * center_x)
    center_y = int(height * center_y)
    radius = int(min(width, height) * radius)
    
    # Create mask
    mask = np.zeros((height, width), dtype=np.uint8)
    y, x = np.ogrid[:height, :width]
    mask_area = (x - center_x) ** 2 + (y - center_y) ** 2 <= radius ** 2
    mask[mask_area] = 255
    
    return mask

def validate_image_dimensions(image, max_size=2048):
    """Validate and resize image if needed"""
    height, width = image.shape[:2]
    
    if max(height, width) > max_size:
        scale = max_size / max(height, width)
        new_height = int(height * scale)
        new_width = int(width * scale)
        
        resized = cv2.resize(image, (new_width, new_height), interpolation=cv2.INTER_AREA)
        print(f"Image resized from {width}x{height} to {new_width}x{new_height}")
        return resized
    return image

def prepare_image_for_inference(image):
    """Prepare an image array for the inference pipeline"""
    if image.ndim == 3 and image.shape[2] == 3:
        # Convert float images (assumed to be in [0, 1]) to uint8,
        # clipping to avoid wrap-around on out-of-range values
        if image.dtype != np.uint8:
            image = (np.clip(image, 0.0, 1.0) * 255).astype(np.uint8)
        return image
    else:
        raise ValueError("Image must be a 3-channel RGB array")

def save_temporary_file(data, suffix=".png"):
    """Save image data to a temporary file and return its path"""
    if isinstance(data, np.ndarray):
        if data.ndim == 2:
            # Grayscale image
            img = Image.fromarray(data, mode='L')
        else:
            # RGB image
            img = Image.fromarray(data)
    else:
        img = data

    # Use mkstemp so the path remains usable after the handle is closed;
    # saving to an open NamedTemporaryFile by name fails on Windows.
    fd, temp_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    img.save(temp_path)
    return temp_path

def cleanup_temp_files(temp_paths):
    """Clean up temporary files"""
    for path in temp_paths:
        try:
            if os.path.exists(path):
                os.unlink(path)
        except Exception as e:
            print(f"Warning: Could not delete temporary file {path}: {e}")

def get_inference_status():
    """Check if inference modules are available"""
    try:
        from inference import Inference, load_image, load_single_mask
        return True, "Inference modules available"
    except ImportError as e:
        return False, f"Inference modules not available: {e}"

def format_file_size(size_bytes):
    """Format file size in human readable format"""
    if size_bytes < 1024:
        return f"{size_bytes} B"
    elif size_bytes < 1024**2:
        return f"{size_bytes/1024:.1f} KB"
    elif size_bytes < 1024**3:
        return f"{size_bytes/(1024**2):.1f} MB"
    else:
        return f"{size_bytes/(1024**3):.1f} GB"

def create_sample_mask_options():
    """Create sample mask creation options"""
    return [
        ("No mask", None),
        ("Center circle", "circle_center"),
        ("Center ellipse", "ellipse_center"),
        ("Full image", "full"),
    ]
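
if __name__ == "__main__":
    # Minimal smoke test of the helpers above. The synthetic image and the
    # printed values are illustrative only; no inference is performed here.
    sample = np.zeros((512, 768, 3), dtype=np.uint8)
    sample = prepare_image_for_inference(validate_image_dimensions(sample))
    mask = create_demo_mask_from_image(sample.shape)

    mask_path = save_temporary_file(mask)
    size = os.path.getsize(mask_path)
    print(f"Demo mask written to {mask_path} ({format_file_size(size)})")

    available, status = get_inference_status()
    print(status)

    cleanup_temp_files([mask_path])
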
This Gradio application provides:

## Key Features:

1. **Professional UI**: Modern interface with gradient header and clear sections
2. **Image Upload**: Drag-and-drop or click to upload images
3. **Optional Mask Upload**: Upload segmentation masks for focused processing
4. **Configuration Options**: Adjustable random seed and model selection
5. **Real-time Status**: Progress updates and error handling
6. **Download Functionality**: Direct download of generated 3D models
7. **Demo Mode**: Works even without the inference module installed (a fallback sketch follows this list)
8. **Error Handling**: Robust error management with user-friendly messages
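
The demo mode and error handling above can be a thin wrapper around the utility functions. Below is a minimal sketch, assuming the utilities live in a module named `utils` and that the real SAM-3D-objects call is injected as `run_pipeline`; both names are hypothetical, not part of the code above.

```python
import numpy as np

from utils import (  # assumed module name for the utilities shown earlier
    create_demo_mask_from_image,
    get_inference_status,
    prepare_image_for_inference,
    validate_image_dimensions,
)

def generate(image: np.ndarray, run_pipeline=None, seed: int = 42):
    """Return (model_path, preview_mask, status) with a demo-mode fallback."""
    available, status = get_inference_status()
    if not available or run_pipeline is None:
        # Demo mode: no inference module installed, so return a centered
        # circular mask as a stand-in preview instead of failing outright.
        return None, create_demo_mask_from_image(image.shape), f"Demo mode: {status}"

    try:
        image = prepare_image_for_inference(validate_image_dimensions(image))
        model_path = run_pipeline(image, seed=seed)  # real SAM-3D-objects call
        return model_path, None, "3D model generated"
    except Exception as e:
        # Surface failures as a user-facing status instead of a traceback
        return None, None, f"Generation failed: {e}"
```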

## Usage Instructions:

1. **Setup**: Make sure to clone the SAM-3D-objects repository and install dependencies
2. **Image Upload**: Upload the image you want to convert to 3D
3. **Mask (Optional)**: Upload a mask for better segmentation results
4. **Configure**: Adjust the random seed if needed
5. **Generate**: Click the generate button to create your 3D model
6. **Download**: Save the generated PLY file when complete (see the interface sketch below)
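
The flow above maps onto a small `gr.Blocks` layout. A rough wiring sketch follows; the component names and the stub callback are illustrative, and in the real app a callback like the `generate` sketch above would return the PLY path.

```python
import gradio as gr

def generate_model(image, mask, seed):
    # Placeholder callback: the real SAM-3D-objects pipeline would run here
    # and return the path of the generated .ply file plus a status message.
    return None, "Demo: inference pipeline not wired in this sketch"

with gr.Blocks(title="Image to 3D") as demo:
    gr.Markdown("## SAM-3D-objects: Image to 3D")
    with gr.Row():
        image_in = gr.Image(type="numpy", label="Input image")
        mask_in = gr.Image(type="numpy", label="Segmentation mask (optional)")
    seed_in = gr.Number(value=42, label="Random seed", precision=0)
    run_btn = gr.Button("Generate 3D model")
    ply_out = gr.File(label="Generated PLY")
    status_out = gr.Textbox(label="Status", interactive=False)

    run_btn.click(
        generate_model,
        inputs=[image_in, mask_in, seed_in],
        outputs=[ply_out, status_out],
    )

if __name__ == "__main__":
    demo.launch()
```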

The app includes a "Built with anycoder" link as requested and provides a complete working interface for image-to-3D conversion using the SAM-3D-objects inference pipeline.

**Important**: Before running, make sure to:
1. Clone the repository: `git clone https://github.com/facebookresearch/sam-3d-objects`
2. Install dependencies as per the repository requirements
3. Ensure the model checkpoints are available in the `checkpoints/` directory (a quick pre-flight check is sketched below)
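
A quick pre-flight check before launching, assuming the utilities above live in a module named `utils` and the checkpoints sit in `checkpoints/` (both assumptions):

```python
import os

from utils import format_file_size, get_inference_status  # assumed module name

available, message = get_inference_status()
print(message)

ckpt_dir = "checkpoints"
if os.path.isdir(ckpt_dir):
    for name in sorted(os.listdir(ckpt_dir)):
        path = os.path.join(ckpt_dir, name)
        if os.path.isfile(path):
            print(f"  {name}: {format_file_size(os.path.getsize(path))}")
else:
    print(f"Missing checkpoint directory: {ckpt_dir}")
```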