zsxkib/audio-flamingo-3:2856d42f

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.

| Field | Type | Default value | Description |
|---|---|---|---|
| audio | string | | Audio file to analyze. Supports speech, music, and sound effects. Maximum duration: 10 minutes. |
| prompt | string | Please describe this audio in detail. | Question or instruction about the audio. |
| system_prompt | string | | System instructions to customize the model's behavior, output format, or analysis style. Leave empty for default behavior. |
| enable_thinking | boolean | False | Enable detailed chain-of-thought reasoning for complex analysis. False for faster responses, True for deeper insights. |
| temperature | number | 0.3 (Max: 1) | Controls response creativity and randomness. Lower values (0.1-0.3) for factual analysis, higher values (0.7-0.9) for creative interpretation. |
| max_length | integer | 1024 (Min: 50, Max: 2048) | Maximum length of the response in tokens. Shorter for concise answers, longer for detailed analysis. |
| start_time | number | | Start time in seconds for audio segment analysis (optional). Useful for long audio files. |
| end_time | number | | End time in seconds for audio segment analysis (optional). Must be greater than start_time. |
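
The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model's own tooling: `build_input` is a hypothetical helper, and it assumes that `audio` is required (no default is listed) and is passed as a URL string.

```python
from typing import Any, Dict, Optional


def build_input(
    audio: str,
    prompt: str = "Please describe this audio in detail.",
    system_prompt: Optional[str] = None,
    enable_thinking: bool = False,
    temperature: float = 0.3,
    max_length: int = 1024,
    start_time: Optional[float] = None,
    end_time: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build an input dict that respects the table's limits."""
    # Assumption: audio is required because the table lists no default for it.
    if not audio:
        raise ValueError("audio is required")
    # The table lists only an upper bound for temperature.
    if temperature > 1:
        raise ValueError("temperature must not exceed 1")
    if not 50 <= max_length <= 2048:
        raise ValueError("max_length must be between 50 and 2048")
    if start_time is not None and end_time is not None and end_time <= start_time:
        raise ValueError("end_time must be greater than start_time")

    payload: Dict[str, Any] = {
        "audio": audio,
        "prompt": prompt,
        "enable_thinking": enable_thinking,
        "temperature": temperature,
        "max_length": max_length,
    }
    # Optional fields are left out when unset so the model's defaults apply.
    optional = {
        "system_prompt": system_prompt,
        "start_time": start_time,
        "end_time": end_time,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


# Example: analyze only the segment from 30 s to 60 s of a (placeholder) file.
payload = build_input("https://example.com/clip.wav", start_time=30, end_time=60)
```

The resulting dictionary is what you would pass as the model's input through whichever API client you use. Validating locally avoids a round trip for requests that would be rejected anyway.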

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema

{"title": "Output", "type": "string"}
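
Because the schema declares a single string, a caller can guard against unexpected response shapes with a simple type check. This is a small sketch under that assumption; `check_output` is a hypothetical helper and does not perform full JSON Schema validation.

```python
from typing import Any

# The output schema as listed above.
OUTPUT_SCHEMA = {"title": "Output", "type": "string"}


def check_output(value: Any) -> str:
    """Hypothetical helper: return value unchanged if it matches the string schema."""
    if OUTPUT_SCHEMA["type"] == "string" and not isinstance(value, str):
        raise TypeError(f"expected a string output, got {type(value).__name__}")
    return value


# Example with a placeholder response, not real model output.
text = check_output("placeholder response text")
```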