hamelsmu / genstruct-7b

Genstruct 7B is an instruction-generation model designed to create valid instructions from raw text. This enables the creation of new, partially synthetic instruction-finetuning datasets from any raw-text corpus.


Run time and cost

This model costs approximately $0.035 to run on Replicate, or about 28 runs per $1, though the exact cost varies with your inputs. It is also open source, so you can run it on your own computer with Docker.
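If you start the Docker image locally, Cog exposes an HTTP prediction endpoint. Below is a minimal sketch of querying it from Python; the image name, port mapping, and example inputs are assumptions rather than details taken from this page:

import requests

# Assumes the container was started with something like:
#   docker run -p 5000:5000 --gpus=all r8.im/hamelsmu/genstruct-7b
# (image name and port are assumptions, not confirmed here).
resp = requests.post(
    "http://localhost:5000/predictions",
    json={
        "input": {
            "title": "Photosynthesis",
            "content": "Photosynthesis converts light energy into chemical energy.",
        }
    },
)
resp.raise_for_status()
print(resp.json()["output"])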

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 49 seconds.

Readme

Hosted version of NousResearch/Genstruct-7B

The files used to host this model are available on GitHub.

This model is served with vLLM, using the following code:

import os

# Enable hf_transfer for faster model downloads from the Hugging Face Hub.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

import torch
from cog import BasePredictor
from vllm import LLM, SamplingParams

def prompt(title, content):
    # Assemble the Genstruct prompt format from a title and a passage.
    return f"""[[[Title]]] {title}

[[[Content]]] {content}

The following is an interaction between a user and an AI assistant that is related to the above text.

[[[User]]] """

class Predictor(BasePredictor):

    def setup(self):
        # Shard the model across all available GPUs.
        n_gpus = torch.cuda.device_count()
        self.llm = LLM(model='NousResearch/Genstruct-7B',
                       tensor_parallel_size=n_gpus)

    def predict(self, title: str, content: str, temp: float = 0.0, max_tokens: int = 2000) -> str:
        _p = prompt(title, content)
        # temperature=0.0 gives greedy decoding; ignore_eos=True keeps
        # generating until max_tokens is reached.
        sampling_params = SamplingParams(temperature=temp, ignore_eos=True, max_tokens=max_tokens)
        out = self.llm.generate(_p, sampling_params=sampling_params, use_tqdm=False)
        return out[0].outputs[0].text
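As a usage sketch, here is how you might call the hosted model with the replicate Python client. The input names match the predict() signature above, but the exact model reference is an assumption (you may need to pin a version hash):

import replicate

# Hypothetical call to the hosted model. Depending on the client version
# you may need a pinned reference like "hamelsmu/genstruct-7b:<version>".
output = replicate.run(
    "hamelsmu/genstruct-7b",
    input={
        "title": "Photosynthesis",
        "content": "Photosynthesis converts light energy into chemical energy.",
        "temp": 0.0,
        "max_tokens": 2000,
    },
)
print(output)

Per the upstream Genstruct-7B model card, the generated text is a user question grounded in the passage, followed by an [[[Assistant]]] answer.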

For more information, see these docs.