• Rhaedas@fedia.io
    9 days ago

    I stopped at speech recognition. That’s the only important part of this that needs to involve any complex AI. The rest is basic programming and doesn’t need a neural net at all. Think of a modern phone tree: there are some phrases that will get recognized as menu items, and anything beyond that gets dumped to the human still monitoring the activity.
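    The phone-tree idea above can be sketched in a few lines - match a transcribed phrase against a fixed menu, and escalate everything else to the human. The menu items, prices, and function names here are made up purely for illustration, not taken from any real drive-thru system:

    ```python
    # Hypothetical menu: item name -> price. In a real system this
    # would come from the store's point-of-sale configuration.
    MENU = {
        "cheeseburger": 3.49,
        "fries": 1.99,
        "cola": 1.29,
    }

    def handle_utterance(text: str) -> str:
        """Route one transcribed utterance, phone-tree style:
        a known menu item gets handled automatically, anything
        else gets dumped to the human monitor."""
        item = text.strip().lower()
        if item in MENU:
            return f"added {item} (${MENU[item]:.2f})"
        return "escalated to human operator"
    ```

    The hard part, as the rest of the comment argues, is everything upstream of `text`: turning noisy drive-thru audio into a clean string at all.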

    Good luck with the first part, though. Having worked the drive-thru in my day (long ago, but not much has changed), I can say the noise level on the input is all over the place. The human ear is very good at picking things out, so you can usually piece the order together, but even today’s phone trees that I mentioned, or a smartphone tied to Google/Siri in real time, can screw up basic words - and that’s with an algorithm that learns the user’s voice, not a different person with a different accent, volume, car noise, etc. each time.

    Also let me add - even though the human ear is far superior at picking out nuances in a high-noise environment, many people working the drive-thru still suck at understanding even clear speech. Getting AI/LLM/whatever past even a basically incompetent human order taker would be a monumental accomplishment that could filter into so many other things. Read that as sarcasm - it hasn’t happened yet, and it won’t happen via a drive-thru cost-cutting replacement meant to pad bottom-line profits.