Some thoughts about ASR
What you’re reading is the result of dictation on an M2 Max MacBook Pro, which I upgraded to after finally admitting that I needed a new computer.
I don’t know how good it is going to be, but I expect it to be a little better than what we get on the iPhone and the other devices in the Apple ecosystem. So far, the system has some issues: it recognizes certain words incorrectly. Partly that could be because I pronounce those words in a way that doesn’t match what the algorithm was trained to expect, so it ends up transcribing the wrong thing, because what it hears is, in effect, incomprehensible.
When it comes to automatic speech recognition, the algorithms that process our speech struggle when the input is ambiguous and could plausibly match several outputs. Speech recognition is, after all, a prediction task: the system estimates the most likely text given the audio it receives from the user. When a prediction is wrong, the software is effectively penalized for the mismatch between input and output. The ultimate goal is for every prediction the model generates, as the person speaks, to match what the person intended to say with the fewest possible corrections.
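I have no idea what metric Apple uses internally, but the standard way to quantify “fewest possible corrections” is word error rate: the minimum number of word-level edits needed to turn the transcription into what was actually said, divided by the length of the intended sentence. A minimal sketch (the function name and the toy sentences are mine, not anything from Apple’s system):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Minimum word-level edits (substitutions, insertions, deletions)
    needed to turn the hypothesis into the reference, divided by the
    reference length -- the classic dynamic-programming edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# One wrong word out of four -> 25% error rate.
print(word_error_rate("the quick brown fox", "the quack brown fox"))  # 0.25
```

A transcription with a word error rate of zero is exactly the scenario described above: every prediction matches the intent, no corrections needed.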
Let’s start with the voice’s pitch and ambient noise.
For an algorithm to produce correct output, the sound it receives should resemble the waveforms in the training dataset that guided the development of the speech recognition system. If the sound pattern deviates, whether because of ambient noise or tone of voice, the generated text can be inaccurate, because the system is receiving problematic input in the first place. This is natural: feed garbage in, and you get garbage out. There is a sort of equivalent exchange at play.
Let’s now talk about something different altogether: the device’s processing power. I’m not sure whether this is a factor for the speech recognition system on Apple devices, but I suspect it is. Each device needs to perform complex calculations fast enough to keep up with speech at roughly 100 to 130 words per minute. If you watch the computer’s memory usage while it does this, you may find that memory consumption stays fairly low; that is something I plan to test in the coming days.
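To make the budget concrete: at 100 to 130 words per minute, the device has roughly half a second per word to do all of its decoding. A small sketch of how you might measure whether some per-word computation keeps up (the decoding stand-in is entirely hypothetical; a real recognizer would be running a neural acoustic model here, not arithmetic):

```python
import time

def words_per_minute(n_words: int, seconds: float) -> float:
    """Convert a measured decode time into a throughput figure."""
    return n_words * 60.0 / seconds

def fake_decode_one_word():
    # Hypothetical stand-in for the per-word work an ASR decoder does.
    return sum(i * i for i in range(20000))

n_words = 200
start = time.perf_counter()
for _ in range(n_words):
    fake_decode_one_word()
elapsed = time.perf_counter() - start

# If this prints well above ~130, the "decoder" keeps up with dictation.
print(f"{words_per_minute(n_words, elapsed):.0f} words/min")
```

The same loop, wrapped with a memory profiler instead of a timer, is essentially the experiment described above: watch consumption while the decoding runs and see whether it ever spikes.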
The last possibility is a problem with the algorithm itself in recognizing certain waveform patterns. Output can vary with input quality, but the algorithm that processes the data can also simply make mistakes on occasion. I’m confident that Apple is making strides to improve the output quality of its algorithms, and for that reason, I am optimistic about the improvements they can bring about.
I give this much thought because speech recognition is one of the most important aspects of any generative artificial intelligence system. These systems rely on good input, and speech recognition is an extremely important source of it. Since we use natural language every day, and speaking is often faster than typing or pressing buttons on our devices, I believe that accurate dictation and procedural proofreading of our daily writing will lead us into a new era of AI.
I can’t wait to see what the future holds, particularly as we approach the end of 2023 with developments like iOS 17, AI generators, and all the other forms of technology becoming more prevalent. Time passes, and it is somewhat sad to think how much of our lives will go by before these things come to fruition; still, the show is not over until it is over, and I am excited to see what comes next!