Add Replicate demo

pull/214/head
ariel 2 years ago
parent 81a10937c7
commit 64097bd19b

@@ -7,6 +7,7 @@
Official repository for the paper [Robust High-Resolution Video Matting with Temporal Guidance](https://peterl1n.github.io/RobustVideoMatting/). RVM is specifically designed for robust human video matting. Unlike existing neural models that process frames as independent images, RVM uses a recurrent neural network to process videos with temporal memory. RVM can perform matting in real-time on any video without additional inputs. It achieves **4K 76FPS** and **HD 104FPS** on an Nvidia GTX 1080 Ti GPU. The project was developed at [ByteDance Inc.](https://www.bytedance.com/)
<br>
<a href="https://replicate.com/arielreplicate/robust_video_matting"><img src="https://replicate.com/arielreplicate/robust_video_matting/badge"></a>
## News
@@ -34,7 +35,7 @@ All footage in the video are available in [Google Drive](https://drive.google.co
## Demo
* [Webcam Demo](https://peterl1n.github.io/RobustVideoMatting/#/demo): Run the model live in your browser. Visualize recurrent states.
* [Colab Demo](https://colab.research.google.com/drive/10z-pNKRnVNsp0Lq9tH1J_XPZ7CBC_uHm?usp=sharing): Test our model on your own videos with free GPU.
* [Replicate Demo](https://replicate.com/arielreplicate/robust_video_matting): Test our model through the Replicate web UI or its Python API (see the sketch below).
<br>
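
A minimal sketch of calling the hosted demo from Python, assuming the `replicate` client package is installed and `REPLICATE_API_TOKEN` is set; the version id and the local file name are placeholders to fill in from the model page and your own data:

```python
import replicate

# Placeholder version id; copy the current one from the Replicate model page.
output = replicate.run(
    "arielreplicate/robust_video_matting:<version-id>",
    input={
        "input_video": open("input.mp4", "rb"),  # placeholder local clip
        "output_type": "green-screen",           # or "alpha-mask" / "foreground-mask"
    },
)
print(output)  # URL of the rendered video
```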
## Download

@@ -0,0 +1,14 @@
build:
gpu: true
python_version: 3.8
system_packages:
- libgl1-mesa-glx
- libglib2.0-0
python_packages:
- torch==1.9.0
- torchvision==0.10.0
- av==8.0.3
- tqdm==4.61.1
- pims==0.5
predict: "predict.py:Predictor"
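
The build spec above does not fetch the `rvm_resnet50.pth` checkpoint that `setup()` in `predict.py` loads. A minimal sketch for downloading it ahead of time, assuming the weights remain published under the project's GitHub releases (the URL is an assumption to verify against the releases page):

```python
import os
import torch

# Assumed checkpoint location; confirm against the RobustVideoMatting releases page.
WEIGHTS_URL = "https://github.com/PeterL1n/RobustVideoMatting/releases/download/v1.0.0/rvm_resnet50.pth"
WEIGHTS_PATH = "rvm_resnet50.pth"

if not os.path.exists(WEIGHTS_PATH):
    # Download the weights next to predict.py so setup() can find them.
    torch.hub.download_url_to_file(WEIGHTS_URL, WEIGHTS_PATH)
```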

@@ -0,0 +1,32 @@
import torch
from cog import BasePredictor, Input, Path

from inference import convert_video
from model import MattingNetwork


class Predictor(BasePredictor):
    def setup(self):
        # Load the ResNet50 variant of RVM onto the GPU once per container.
        self.model = MattingNetwork('resnet50').eval().cuda()
        self.model.load_state_dict(torch.load('rvm_resnet50.pth'))

    def predict(
        self,
        input_video: Path = Input(description="Video to segment."),
        output_type: str = Input(
            default="green-screen",
            choices=["green-screen", "alpha-mask", "foreground-mask"],
            description="Which rendering of the matte to return.",
        ),
    ) -> Path:
        # Render all three outputs in one pass; only the requested one is returned below.
        convert_video(
            self.model,                               # The model, can be on any device (cpu or cuda).
            input_source=str(input_video),            # A video file or an image sequence directory.
            output_type='video',                      # Choose "video" or "png_sequence".
            output_composition='green-screen.mp4',    # File path if video; directory path if png sequence.
            output_alpha="alpha-mask.mp4",            # [Optional] Output the raw alpha prediction.
            output_foreground="foreground-mask.mp4",  # [Optional] Output the raw foreground prediction.
            output_video_mbps=4,                      # Output video bitrate in Mbps. Not needed for png sequence.
            downsample_ratio=None,                    # A hyperparameter to adjust, or None for auto.
            seq_chunk=12,                             # Process n frames at once for better parallelism.
        )
        return Path(f'{output_type}.mp4')
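
A minimal local smoke test of the predictor, assuming a CUDA GPU, the `rvm_resnet50.pth` checkpoint in the working directory, and a short placeholder clip named `input.mp4`:

```python
# Hypothetical local check; run from the repository root next to predict.py.
from predict import Predictor

predictor = Predictor()
predictor.setup()  # loads MattingNetwork('resnet50') onto the GPU
result = predictor.predict(
    input_video="input.mp4",      # placeholder clip
    output_type="green-screen",   # or "alpha-mask" / "foreground-mask"
)
print(result)  # path to green-screen.mp4
```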