Build Your Own Voice Calculator: A Beginner’s Guide
Creating a voice calculator is a rewarding beginner project that combines basic programming, speech recognition, and simple math parsing. This guide walks you through a minimal working Voice Calculator using web technologies (JavaScript, HTML) and the browser’s Web Speech API so you can speak calculations and get results quickly.
What you’ll build
A web page that:
- Listens for spoken math expressions (e.g., “twenty three plus seven”, “five times six”).
- Converts speech to text.
- Parses the text into a mathematical expression.
- Evaluates the expression safely and displays the result.
Tools and prerequisites
- Basic HTML, CSS, and JavaScript knowledge.
- A modern browser with Web Speech API support (Chrome/Edge).
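- Optional: a text editor and a local web server. Opening the HTML file directly may work, but browsers handle microphone permission differently for file:// pages, so serving the page from localhost is the more reliable option.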
Core approach (high level)
- Capture speech with the Web Speech API (SpeechRecognition).
- Normalize the recognized text (lowercase, remove filler words).
- Convert number words to digits and map spoken operators to symbols.
- Validate and evaluate the parsed expression safely.
- Display the result and handle edge cases (errors, unsupported phrases).
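The number-word conversion step in the list above can be sketched in isolation. This is a minimal illustration rather than a definitive implementation: the names `UNITS`, `TENS_WORDS`, and `numberFromWords` are hypothetical, and the sketch assumes English number words up to 999,999 with an optional "and".

```javascript
// Hypothetical lookup tables; an array index doubles as the numeric value.
const UNITS = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten',
  'eleven', 'twelve', 'thirteen', 'fourteen', 'fifteen', 'sixteen', 'seventeen', 'eighteen', 'nineteen'];
const TENS_WORDS = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy', 'eighty', 'ninety'];

function numberFromWords(phrase) {
  let total = 0;   // completed thousands groups
  let current = 0; // group currently being built (0-999)
  for (const word of phrase.toLowerCase().split(/[\s-]+/).filter(Boolean)) {
    if (word === 'and') continue; // tolerate "one hundred and five"
    const unit = UNITS.indexOf(word);
    const ten = TENS_WORDS.indexOf(word);
    if (unit !== -1) current += unit;
    else if (ten > 1) current += ten * 10;
    else if (word === 'hundred') current = (current || 1) * 100;
    else if (word === 'thousand') { total += (current || 1) * 1000; current = 0; }
    else throw new Error('Unknown number word: ' + word);
  }
  return total + current;
}

console.log(numberFromWords('one hundred twenty three')); // 123
console.log(numberFromWords('two thousand and seven'));   // 2007
```

The key design choice is that "hundred" multiplies the group built so far while "thousand" closes the group and moves it into the running total, so "one hundred twenty three" accumulates as 1, 100, 120, 123.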
Minimal working example
Create a file named index.html and paste the following code:
```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <title>Voice Calculator</title>
  <style>
    body { font-family: system-ui, Arial; padding: 24px; max-width: 720px; margin: auto; }
    button { padding: 12px 18px; font-size: 16px; }
    #transcript, #result { margin-top: 12px; font-size: 18px; }
    .error { color: #c00; }
  </style>
</head>
<body>
  <h1>Voice Calculator</h1>
  <button id="start">Start Listening</button>
  <div id="transcript">Transcript: <span id="txt">&mdash;</span></div>
  <div id="result">Result: <span id="res">&mdash;</span></div>
  <script>
    // Feature-detect
    const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    const resEl = document.getElementById('res');
    const txtEl = document.getElementById('txt');
    const startBtn = document.getElementById('start');

    if (!SpeechRecognition) {
      resEl.textContent = 'SpeechRecognition not supported in this browser.';
      resEl.classList.add('error');
      startBtn.disabled = true;
    } else {
      const rec = new SpeechRecognition();
      rec.lang = 'en-US';
      rec.interimResults = false;
      rec.maxAlternatives = 1;

      startBtn.addEventListener('click', () => rec.start());

      rec.addEventListener('result', (ev) => {
        const text = ev.results[0][0].transcript;
        txtEl.textContent = text;
        try {
          const expr = parseSpokenExpression(text);
          const value = evaluateExpression(expr);
          resEl.textContent = value;
          resEl.classList.remove('error'); // clear styling left by an earlier failure
        } catch (err) {
          resEl.textContent = 'Error: ' + err.message;
          resEl.classList.add('error');
        }
      });

      rec.addEventListener('end', () => { /* ready for next */ });
    }

    // Basic number-word to numeric mapping (supports 0-999,999)
    const SMALL = {
      'zero': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5, 'six': 6, 'seven': 7, 'eight': 8, 'nine': 9,
      'ten': 10, 'eleven': 11, 'twelve': 12, 'thirteen': 13, 'fourteen': 14, 'fifteen': 15,
      'sixteen': 16, 'seventeen': 17, 'eighteen': 18, 'nineteen': 19
    };
    const TENS = {
      'twenty': 20, 'thirty': 30, 'forty': 40, 'fifty': 50, 'sixty': 60, 'seventy': 70, 'eighty': 80, 'ninety': 90
    };

    function wordsToNumber(words) {
      // Supports numbers like "one hundred twenty three" or "forty five"
      const parts = words.split(/[\s-]+/).filter(Boolean);
      if (parts.length === 0) throw new Error('Missing number.');
      let total = 0, current = 0;
      parts.forEach(p => {
        if (SMALL[p] !== undefined) current += SMALL[p];
        else if (TENS[p] !== undefined) current += TENS[p];
        // "hundred" scales the group so far: "two hundred" is 200, bare "hundred" is 100
        else if (p === 'hundred') current = (current || 1) * 100;
        // "thousand" closes the group and moves it into the total
        else if (p === 'thousand') { total += (current || 1) * 1000; current = 0; }
        else throw new Error('Unknown number word: ' + p);
      });
      return total + current;
    }

    function normalize(text) {
      return text.toLowerCase()
        // \b keeps "and" from being stripped out of "thousand"
        .replace(/\b(?:what is|calculate|equals|equal to|please|hey|ok|and)\b/g, ' ')
        .replace(/[^a-z0-9\s.-]/g, ' ')
        .replace(/\s+/g, ' ').trim();
    }

    function parseSpokenExpression(text) {
      const cleaned = normalize(text);
      // Operator mapping
      const ops = {
        'plus': '+', 'add': '+', 'added to': '+',
        'minus': '-', 'subtract': '-', 'less': '-',
        'times': '*', 'multiplied by': '*', 'multiply': '*', 'x': '*',
        'divided by': '/', 'over': '/', 'divide': '/'
      };
      // Try simple "number operator number" first.
      // Build the regex dynamically from the ops keys, longest first.
      const opKeys = Object.keys(ops)
        .sort((a, b) => b.length - a.length)
        .map(k => k.replace(/ /g, '\\s+'));
      const re = new RegExp(`^(.+)\\s+(${opKeys.join('|')})\\s+(.+)$`);
      const m = cleaned.match(re);
      if (!m) throw new Error('Could not parse expression. Try "five plus two" or "twenty three divided by seven".');
      const left = m[1].trim();
      const opWord = m[2].replace(/\s+/g, ' ');
      const right = m[3].trim();
      const op = ops[opWord] || (() => { throw new Error('Unsupported operator: ' + opWord); })();
      const leftNum = isNaN(Number(left)) ? wordsToNumber(left) : Number(left);
      const rightNum = isNaN(Number(right)) ? wordsToNumber(right) : Number(right);
      // Spaces keep a negative right operand from producing "--"
      return `${leftNum} ${op} ${rightNum}`;
    }

    function evaluateExpression(expr) {
      // Very simple safety: only digits, operators, decimal points, parentheses and spaces allowed
      if (!/^[0-9+\-*\/().\s]+$/.test(expr)) throw new Error('Unsafe expression.');
      // eslint-disable-next-line no-eval
      const val = eval(expr);
      if (!isFinite(val)) throw new Error('Result is not finite.');
      return val;
    }
  </script>
</body>
</html>
```
How it works (brief)
- SpeechRecognition captures spoken text.
- normalize() removes filler words and punctuation.
- parseSpokenExpression() maps spoken operators to symbols and converts number words to numeric values.
- evaluateExpression() checks the expression string against a whitelist of digits, operators, decimal points, and parentheses, then uses eval for simplicity (acceptable for this tutorial; see notes below).
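One gap in the steps above: depending on the recognizer, a transcript may arrive with digits and symbols already in it (for example "5 + 2" instead of "five plus two"), and normalize() strips characters such as +, * and /. The sketch below shows one way to map symbols back to operator words before normalizing. The name `symbolsToWords` and the symbol list are assumptions for illustration, not part of the original demo.

```javascript
// Hypothetical pre-normalization step: map math symbols to the spoken
// operator words that the parser already understands.
const SYMBOL_WORDS = [
  [/\s*\+\s*/g, ' plus '],
  [/\s+[-\u2212]\s+/g, ' minus '],        // only a spaced hyphen, so "twenty-three" stays intact
  [/\s*[*\u00d7]\s*/g, ' times '],        // * and the multiplication sign
  [/\s*[\/\u00f7]\s*/g, ' divided by '],  // / and the division sign
];

function symbolsToWords(text) {
  return SYMBOL_WORDS
    .reduce((out, [pattern, word]) => out.replace(pattern, word), text)
    .trim();
}

console.log(symbolsToWords('5 + 2'));            // "5 plus 2"
console.log(symbolsToWords('twenty-three - 7')); // "twenty-three minus 7"
```

In the demo this would be called as normalize(symbolsToWords(text)), so the rest of the parsing logic stays unchanged.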
Extensions and improvements
- Support larger numbers, decimals, negatives, parentheses, and multi-term expressions.
- Replace eval with a proper math expression parser (e.g., math.js) for security and richer features.
- Add continuous listening and UI indicators (listening, processing).
- Add voice output using SpeechSynthesis for spoken results.
- Improve language support and fuzzy matching for more natural phrases.
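As one example from the list above, spoken results can be sketched with the browser's SpeechSynthesis interface. The helper names `describeResult` and `speak` are hypothetical, and the wording of the spoken sentence is only an assumption. The speak() function feature-detects and returns false where synthesis is unavailable.

```javascript
// Turn a numeric result into a short sentence suitable for speech.
function describeResult(value) {
  if (!Number.isFinite(value)) return 'Sorry, I could not compute that.';
  const rounded = Number(value.toFixed(6)); // trims float noise such as 0.30000000000000004
  const spoken = rounded < 0 ? 'minus ' + Math.abs(rounded) : String(rounded);
  return 'The answer is ' + spoken;
}

// Speak the text if speech synthesis is available; returns false otherwise.
function speak(text) {
  if (typeof window === 'undefined' || !('speechSynthesis' in window)) return false;
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.lang = 'en-US';
  window.speechSynthesis.speak(utterance);
  return true;
}

console.log(describeResult(0.1 + 0.2)); // "The answer is 0.3"
```

In the demo, speak(describeResult(value)) could be called right after the result is written to the page.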
Safety notes
- The demo uses a basic safety regex and eval — adequate for controlled inputs but not production-safe. Use a proper parser or sandboxed evaluator for real deployments.
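To make the "proper parser" suggestion concrete, here is a minimal sketch of an eval-free evaluator using recursive descent. It is a swapped-in technique, not the demo's own method: the function name `safeEvaluate` is hypothetical, and the grammar is assumed to cover only non-negative number literals, + - * /, unary minus, and parentheses.

```javascript
// Recursive-descent evaluator for + - * / and parentheses; no eval involved.
function safeEvaluate(source) {
  const tokens = source.match(/\d+(?:\.\d+)?|[-+*\/()]/g) || [];
  // Reject anything the tokenizer skipped (letters, stray symbols, and so on).
  if (tokens.join('') !== source.replace(/\s+/g, '')) throw new Error('Unexpected character.');
  let pos = 0;

  // expression := term (('+' | '-') term)*
  function parseExpression() {
    let value = parseTerm();
    while (tokens[pos] === '+' || tokens[pos] === '-') {
      const op = tokens[pos++];
      const rhs = parseTerm();
      value = op === '+' ? value + rhs : value - rhs;
    }
    return value;
  }

  // term := factor (('*' | '/') factor)*
  function parseTerm() {
    let value = parseFactor();
    while (tokens[pos] === '*' || tokens[pos] === '/') {
      const op = tokens[pos++];
      const rhs = parseFactor();
      value = op === '*' ? value * rhs : value / rhs;
    }
    return value;
  }

  // factor := number | '-' factor | '(' expression ')'
  function parseFactor() {
    const token = tokens[pos++];
    if (token === undefined) throw new Error('Unexpected end of expression.');
    if (token === '-') return -parseFactor();
    if (token === '(') {
      const value = parseExpression();
      if (tokens[pos++] !== ')') throw new Error('Missing closing parenthesis.');
      return value;
    }
    if (/^\d/.test(token)) return Number(token);
    throw new Error('Unexpected token: ' + token);
  }

  const result = parseExpression();
  if (pos < tokens.length) throw new Error('Unexpected token: ' + tokens[pos]);
  if (!Number.isFinite(result)) throw new Error('Result is not finite.');
  return result;
}

console.log(safeEvaluate('2 + 3 * 4'));   // 14
console.log(safeEvaluate('(2 + 3) * 4')); // 20
```

Because the input is only ever tokenized and combined arithmetically, a string such as "alert(1)" is rejected as an unexpected character instead of being executed. A library such as math.js remains the better choice when richer features are needed.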
That’s it — a working starter you can run in your browser. Modify the parsing logic to expand supported phrases and operators as you learn.