{
  "hq": [
    {
      "speaker": 0,
      "text": "Okay. Hello. Hello. Hello.",
      "start": 3.1999998,
      "end": 8.5
    },
    {
      "speaker": 1,
      "text": "Hello.",
      "start": 9.12,
      "end": 9.62
    },
    {
      "speaker": 2,
      "text": "English right now. They do Chinese right now.",
      "start": 14.095,
      "end": 16.755001
    },
    {
      "speaker": 1,
      "text": "It does Chinese or English. Chinese and English results in very dubious performance.",
      "start": 17.855,
      "end": 24.595001
    },
    {
      "speaker": 0,
      "text": "I had",
      "start": 28.74,
      "end": 29.06
    },
    {
      "speaker": 1,
      "text": "to switch everything to, William, it was a tragedy. So you know how Deepgram says they're better than Whisper? It's because Deepgram has nonzero performance in foreign languages and in code-switched audio, whereas Whisper just returns an empty stream if there's code switching. Okay.",
      "start": 29.06,
      "end": 48.595
    },
    {
      "speaker": 2,
      "text": "I see.",
      "start": 48.595,
      "end": 49.155
    },
    {
      "speaker": 1,
      "text": "Whisper can't do code switching. It returns an empty stream.",
      "start": 49.155,
      "end": 51.975
    },
    {
      "speaker": 0,
      "text": "We're using Deepgram. Right?",
      "start": 52.515,
      "end": 54.135002
    },
    {
      "speaker": 1,
      "text": "I switched. Whisper is actually better for English. So we're",
      "start": 54.390003,
      "end": 57.83
    },
    {
      "speaker": 2,
      "text": "Actually, Bailey. Bailey. Bailey. Bailey. You should do a default portfolio, and, like, if the person's main language is Chinese or something, it can switch to a different model. And, also, in the future, it can be switched automatically, because you can detect the conversation scene and do kind of routing in between.",
      "start": 57.91,
      "end": 76.495
    },
    {
      "speaker": 1,
      "text": "You can't detect the language until you transcribe it.",
      "start": 77.195,
      "end": 80.335
    },
    {
      "speaker": 2,
      "text": "Mhmm. Yeah.",
      "start": 80.475,
      "end": 81.295
    },
    {
      "speaker": 0,
      "text": "I think we can default to Deepgram and then save some metadata. Like, save the metadata of Yeah.",
      "start": 81.354996,
      "end": 87.52
    },
    {
      "speaker": 1,
      "text": "I guess if you detect the type, then you can retranscribe it to get better accuracy or something. Right?",
      "start": 87.52,
      "end": 92.96
    },
    {
      "speaker": 0,
      "text": "Like, if you keep getting, like, you know, like, for the past 7 days, this person only used Deepgram to transcribe English, then we can probably... Yeah. Yeah. That would be the switch to Whisper.",
      "start": 92.96,
      "end": 104.265
    },
    {
      "speaker": 1,
      "text": "Okay. Now you can stop the recording unless whether we",
      "start": 105.21638,
      "end": 107.69637
    },
    {
      "speaker": 0,
      "text": "can Okay.",
      "start": 107.69637,
      "end": 108.096375
    },
    {
      "speaker": 2,
      "text": "Let me say something because, otherwise, it's all you. So clearly you and I have different voices. Right? Okay. Stop.",
      "start": 108.096375,
      "end": 114.651436
    }
  ]
}