Speech Command and Control

 
This is a slightly modified version of the talk I gave to the Boston voice users group on March 12, 2002 about using speech to speed up computer command and control. The talk includes a demonstration of a set of mouse macros that allow you to place the mouse, then do something else, all in one command. The "further thoughts" section at the bottom is new, and was sparked by some of the discussion after the talk.
-Kim Patch

I'm here to talk about controlling computers efficiently using speech recognition, and I'll do that in a minute. But first I want you to humor me. I want to teach you how to juggle. The key to juggling is to start with one ball... then when you get comfortable with one, add one more... when two are easy, then you can add the third one.

Okay, now we can talk about speech.

The key to learning how to use voice recognition is to recognize that the interface is something quite different and really learn it.

There's an illusion working against this, however. Because we're so used to communicating using words and because the computer now has the ability to hear commands, it seems like everything should run smoothly right away.

Typewriters presented a similar illusion, although it probably wasn't as strong. I'm willing to bet that the first people to use typewriters had no idea it was possible to press buttons to produce even 50 words a minute, much less the 120 or more that some typists do today. But once the locations of those keys became instinctive, everything sped up.

Because language is already hardwired into our brains, it seems like it would be a simple matter to use words to command a computer.

Have you ever found yourself in a conversation with people whose profession or interests you are not familiar with? Whether they are physicists or architects or foodservice professionals, they'll inevitably use jargon when talking about their profession.

If you are talking to a couple of musicians and you're not familiar with the words they use to refer to attributes of music, they can include you in the conversation by throwing in some extra phrases to explain that Diminuendo means a gradual decrease in volume, Ostinato refers to a pattern of notes repeated many times, and Rubato means a flexible, rather than strict, tempo. Although this may at first slow the conversation, eventually they'll begin repeating the jargon they've already explained, and the conversation will start to speed up.

One of the keys to using voice is to recognize that using language to control machines is something new that could be made much more efficient by establishing shortcuts -- a jargon of sorts.

At the same time, because computers are quite different from people, the process of working out a mutually acceptable vocabulary is also inevitably different. The computer's command and control vocabulary is very limited, especially when you want to tell it to do something that requires several steps -- like moving a folder. This changes verbal communications considerably. There's a second variable at work as well.

Have you ever commanded a person who is using a computer? It's usually slower than just typing in the commands yourself. But after you work with the person for a while, and work out some conventions, it speeds up at least a little. Working with a person in any capacity is similar -- athletes get better as a team when they practice together. And of course, in talking with you, the musicians got you up to speed with the language of musical tempos.

Computers, once again, are very different. The computer will not work with you.

You worked with me with the juggling. You adapted by humoring me even though it might not have been exactly what you expected. A computer wouldn't have. The input would not have been recognized if it wasn't expected. The computer is going to act the same no matter how you act -- you have to work out all the changes to speed the two of you up.

Although this is fairly obvious intellectually, it's not necessarily obvious instinctively because the pieces aren't new. We're used to communicating with machines that don't adapt, although we usually do this using buttons. And we're used to communicating using speech, but that's with humans and maybe a pet or two, and they adapt.

And here's another thing about speech -- one of the beautiful things about language is all the choices we have about how to phrase things in order to evoke subtle shades of meaning. The possibilities in combining words are nearly endless. Computers do not appreciate the vagaries and subtleties of language. This is ironic of course, because they evoke so many colorful word combinations themselves, but they really don't appreciate language.

So here's the situation. You and your computer are attempting a new type of communication, and the two of you are very different.

The good news, however, is even within the limitations, controlling a computer using words has many more possibilities than using a keyboard. Even though a computer's ability to recognize combinations of words is more limited than ours, it still increases the number of practical commands by several orders of magnitude compared to the keyboard.

A computer can recognize many more words than keys, and we can use words in larger combinations than are practical using keys on a keyboard.

So my entire point here is that, in theory, using speech commands to control a computer is much better than using a keyboard. But at the same time it is a much worse communication than the speech we're used to. Therefore it's something new that we have to learn, and, like riding a bike, it's going to be slower at first, but much faster in the end.

With this in mind, I think the best way to work within these limitations is to choose a set of commands and work with them until they're instinctive. Although this is more boring than talking to a person, making a command instinctive means you don't have to think about it, which speeds things up in a way similar to learning the keys on the typewriter so you don't have to look or even think about which is which.

This is a fairly natural process. When you work this way, it will quickly seem very familiar. We get so used to clichés, for instance, that sometimes we say them without even knowing what the words literally mean. How many people haven't heard "get your goat", "the third degree" or "hell bent for leather"? Anybody really know what they mean? Why does it have to be goat? Why not a chicken, why not the 5th degree, why not wool?

Another way we make speech easier to use when we're talking to people is we regularly use shortcuts. How many people here are always called by their full given names? If there are terms you use a lot -- and names of people and places are good examples -- you usually use the shortest forms you can get away with, especially when you have to refer to them regularly. We do this automatically just because it's easier.

And when you are giving directions you may take a long time telling someone how to get to a place, but then when you or your listener sum it up at the end you use a telescoped type of grammar that is consistent and concise -- right at the second light, left on Brattle, third left is Park. One of the reasons it can be so concise is the context is narrow -- you and your listener know you are talking about how to get from one place to another.

The radio traffic report is another good example. They're working within the confines of having to communicate with a large audience, but they are good at being concise. They've introduced some language shortcuts that are somewhat intuitive in order to more quickly tell you how heavy the traffic is on several routes while avoiding repeating the same words over and over again -- here's an example from earlier this evening -- "Delays Pike westbound 93 to Star Market."

Controlling a computer is similar -- the context is pretty narrow and it involves concepts that you're going to bring up over and over again. You can quickly tap into this instinct to telescope the grammar to make it more efficient.

I think the best way to learn to control a computer using speech is to take away the keyboard -- put it someplace where it is inconvenient enough to reach that you can stand the inconvenience of learning speech without being tempted to hit keys.

If when learning to type you penciled in a letter here and there when the corresponding keys were hard to find, it would take you longer to learn to type. The same is true of controlling a computer using speech.

Once you've learned to do everything by voice, you can find the most efficient combinations by consciously choosing when you want to use the keyboard and when you want to use speech. What I've found is that unless you need to be working silently or need your voice for something else, most things are faster using speech.

This is partly because I use a lot of commands that are combinations of words. This is very like the juggling, and you need to approach learning them like juggling as well. I think one of the keys to becoming efficient quickly with speech is to use commands you can eventually combine.

One example from the basic command set in NatSpeak is getting used to saying "New Line" and "New Paragraph" as separate commands, then realizing you can say them after a word without pausing. People often instinctively figure this out.

Listening to yourself work is a good way to find where you need to combine things. If you hear yourself saying the same combination of commands over and over again, see if there's a way to combine them into single commands.

The built-in NaturallySpeaking macros that combine the most commands, and are therefore the most efficient, are the natural language commands in Word.

The best way to learn these is to identify the ones you want to use -- choose one way to say things from among the several that are offered -- and even go so far as to write the commands you've chosen down on a procedure sheet.

When you're learning a new language you have to hear a word something like 21 times before you'll be able to not only recognize but recall it. These commands are things you want to be able to recall without thinking or even hesitating, which is what you're used to on a keyboard.

Here's where the juggling really comes in. Commands that are available both in steps and as a larger command are easier to learn and remember in an instinctive, hardwired way. You'll find yourself saying single commands at first. Then you'll realize you can add two together; and then when it gets easier to add two together, you'll find yourself adding yet another.

This is also going to happen more smoothly if the commands follow similar conventions -- like a grammar. I'm continually reminded of this by my own behavior.

I regularly use the labeling feature of Eudora in the mailbox window to mark certain email messages with different colors. I also regularly change the font color of words in documents when I'm editing a story. It made sense when I wrote the Eudora macro to start off with the word "Label", and have the second word be a color. So when I want to label the mailbox entry blue, I say "Label Blue". Meanwhile, in Word, it made sense to write a macro that started off with the word Font: "Font Blue".

I found myself constantly saying the wrong command in the wrong program. So one day I added Font to the Eudora macro to make them consistent.

I left Label as well because that's what it says on the menu, and so if you've forgotten what the command is, it's important that you can figure out a way to say it by looking at the menu. But because Label is still a viable command, I tend to say a mix of Label and Font in Eudora, which means I still occasionally say "Label Blue" in Word.

I think this has something to do with how we come up with commands -- my first instinct is to make the action happen -- change the color of something. It's an extra step to notice what program I'm in and choose the right command.

Here's another example I haven't done anything about because the commands really are different -- I occasionally find myself saying "Send This Immediately" -- which is what I usually say when I am done writing a message in Eudora, instead of "Save As" -- the command I say when I am done writing a story in Word. Again, this happens because the actions are similar.

So the more consistent you can be across programs in terms of the commands you choose, the faster and easier it will be to think of the commands when you need them.

It also helps if the commands are fairly succinct in the first place, and also can be shortened further as they are added together.

I also want to make a distinction here between two types of macros. Strictly speaking, a custom macro should be something specialized enough that it doesn't make sense to put it in a general set of macros. And strictly speaking, built-in macros, or commands, should give you a good solid vocabulary that allows you to do many things across programs -- communicate well with the computer in general.

Because spoken machine control language is still in its infancy, many of the macros that people make are more general macros that fill in the gaps in the program -- this is all part of the process of building a language.

I'm going to demonstrate some examples from macros that I've made -- these are the types of macros that I think should be part of the broader speech vocabulary -- they are general and work across programs and so are not, in a functional way, custom macros.

There are several layers to many of these macros. The first layer is moving the mouse arrow. The second layer is clicking the mouse, and the third layer is doing whatever you were going to do once you got the arrow in position.

The mouse macros work off a pixel grid and go by 10-pixel increments. I use the 768 by 1024 display size -- Eric uses 600 by 800. So to place the mouse you drop the last digit and say the coordinates. On my screen the coordinates are 1 through 76 down, and 1 through 102 across, and so to place the cursor in the middle of my screen I would say 38 by 51. Eric's coordinates are 1 through 60 down, and 1 through 80 across, and so the middle of his screen is 30 by 40. To make saying the right coordinates much easier, we've put strips of paper around our screens, marked off the numbers in increments of 5. The best way to find where to put the marks is to use the macros.
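The arithmetic behind the grid can be sketched in a few lines of Python. This is only an illustration of the convention described above (the first number counts down the screen, the second counts across, and each step is 10 pixels). The function name grid_to_pixels is hypothetical and is not part of the macros themselves.

```python
# Illustrative sketch only: the real macros are NaturallySpeaking .dvc files.
GRID_STEP = 10  # the macros move in 10-pixel increments


def grid_to_pixels(down, across, step=GRID_STEP):
    """Convert spoken grid coordinates (down first, then across)
    into an (x, y) pixel position."""
    x = across * step  # the second spoken number is the horizontal position
    y = down * step    # the first spoken number is the vertical position
    return (x, y)


# "38 by 51" and "30 by 40" are the two screen centers mentioned in the talk.
print(grid_to_pixels(38, 51))  # (510, 380)
print(grid_to_pixels(30, 40))  # (400, 300)
```

Note that the result is given in the usual x-then-y order, which is the reverse of the order in which the numbers are spoken.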

Let me stress that making these paper markers is key to making this method work well. It works well enough that I rarely get the coordinates wrong when I am, for instance, clicking around the Internet.

You should be able to become fairly precise with this method relatively quickly. If you miss, just say the command over again with the correct coordinates rather than adjusting the mouse some other way. You'll learn to be more accurate with the coordinates more quickly this way.

The macros in the text of this talk were demonstrated during the talk. You can demonstrate on your computer the basic mouse macros and most of the others by downloading the KimMouse1 and KimMouse2 dvc files from my site and importing them to your user. I still use NatSpeak 5, and so have not tested any of my macros on NatSpeak 6 yet. I do know people who have successfully imported these mouse macros into version 6, however.

To demonstrate the Dreamweaver commands you will also need the KimDreamweaver.dvc file. To demonstrate the command that uses a type of voice clipboard ("15 Copy Line to 1"), you'll need to install the KimGlobal.dvc file and copy the "Store" folder to C:\My Documents. This is explained more fully in the macro download documentation posted on www.scriven.com. These should work smoothly if you use version 5. I'm not yet sure which KimGlobal macros work and which do not in version 6. I do plan to eventually fix those broken by changes in version 6, but I'm waiting for version 6 to become more stable before I make the switch.

Instead of saying the exact numbers in these macros, you'll probably want to find your own coordinates.

Moving the mouse without clicking:

"Mouse 25 by 47"
"Mouse 10 by 10"

Because it is much more common to want to both move and click the mouse, especially after you become fairly accurate with these commands, the commands that move and click are shorter and easier to say:

Moving the mouse and clicking in one command:

"52 by 50"
"60 by 27"

To make the command even shorter, you can also leave out the "by":

"52 50"
"60 27"

One advantage to moving the mouse this way is you don't have to find the mouse arrow in order to move it. Try using the mouse commands to click around the Internet:

"40 40" etc.

When you get relatively accurate with these commands, things get really fast when you combine them with one or several more. Try the following commands in a Word document:

Selecting and deleting text:

"Select Word"
"Select Paragraph"
"Delete Paragraph"

Now combine them with the mouse commands:

Moving the mouse and selecting and deleting text:

"30 30 Word"
"40 40 5 Words"
"30 30 Paragraph"
"40 20 3 Paragraphs" (Note: this command requires that the Word natural language commands are turned on.)
"40 40 Delete Paragraph"
"40 40 Cut 3 Paragraphs"
"40 40 Bold Line"
"40 40 Bold 5 Words"
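One way to picture how these combined commands break down into layers is the following Python sketch. It is not how NaturallySpeaking parses commands. It is a simplified, hypothetical pattern that splits a spoken phrase into the move, the click and whatever action follows. The names COMMAND and parse_command are made up for illustration.

```python
import re

# Hypothetical pattern: optional "Mouse", the number down, optional "by",
# the number across, then any remaining words as the action.
COMMAND = re.compile(
    r"^(?P<mouse>mouse\s+)?"
    r"(?P<down>\d+)\s+(?:by\s+)?(?P<across>\d+)"
    r"(?:\s+(?P<action>.+))?$",
    re.IGNORECASE,
)


def parse_command(spoken):
    """Split a spoken mouse command into its layers.

    Returns None when the phrase does not start with two coordinates.
    """
    match = COMMAND.match(spoken.strip())
    if match is None:
        return None
    return {
        "down": int(match.group("down")),
        "across": int(match.group("across")),
        # Only the "Mouse ..." form moves without clicking.
        "click": match.group("mouse") is None,
        "action": match.group("action"),
    }


for phrase in ["Mouse 25 by 47", "52 by 50", "52 50", "40 40 Bold 5 Words"]:
    print(phrase, "->", parse_command(phrase))
```

The sketch treats everything after the two coordinates as a single action string, so "40 40 5 Words" yields the action "5 Words". A phrase without two leading numbers is simply reported as not matching.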

The following commands only work with the changes in the options.ini file to the global capitalization commands (changing "Cap That" to "Add Caps") explained in the macro documentation.

"40 40 Add Caps"
"40 40 Add No Caps"
"40 40 Add All Caps"

The following commands are global, but only work in programs that have a program-specific "Font <Color>" macro. The KimWord and KimDreamweaver macros include this.

"40 40 Green Paragraph"
"40 40 Blue Word"
"40 40 Blue Three Lines"

You can also combine moving the mouse with double and right clicks, and dragging. I use "Touch" in place of "Mouse Click" because it is shorter and easier to say. Try using the following commands to click on and move icons on your desktop.

Moving the mouse and double or right clicking:

"30 30 Touch 2"
"30 30 Touch Right"
"30 30 Right" (This is a shorter version of the right click command.)

"20 20 Drag Down 5"
"20 20 Drag Up 10"

See the full documentation of these global mouse macros posted at www.scriven.com/RSI/RSIdata/KimsMacros/Doc/Moving_the_Mouse.html
for many more examples of combined mouse commands. If you have the KimNetscape or KimIE macros and the Store folder installed, you can also say "Help On Mouse" in Netscape or Internet Explorer to bring up this file locally.

Also, for examples of combined commands that do not involve the mouse, take a look at the "pressing keys and multiple keys" commands at
www.scriven.com/RSI/RSIdata/KimsMacros/Doc/Pressing_Keys_and_Multiple_Keys.html
("Help on Multiple Keys"). These commands are a subset of the KimGlobal macros.

When you get used to working with a computer like this -- two or three or even four steps at once, you'll find that your computer seems considerably faster. This is because you're getting rid of a lot of those tiny little speech recognition lags -- it is faster for the computer to listen for and understand one command and then execute three actions at once than to listen for and understand three separate commands.

Because keyed and mouse commands have become so instinctive, it sometimes seems like you're going very fast when you're clicking around a lot, but the same speed would seem slow if you were accomplishing things at the same rate when interacting with a person.

Watch someone using a mouse and keyboard and picture what you would say if you were telling a person to do the whole job and then if you were talking out each step like you have to do for the computer.

A good metaphor is walking across a room with a toddler, matching their smaller steps, versus striding across the room. Our natural pace in communications is striding, but we've become used to taking many more steps to accommodate the computer.

A somewhat extreme example is saying "can you pull up the budget file for last August?" versus clicking the file menu, clicking Open File, clicking on Open In, clicking on the C Drive, clicking on the Archive folder, clicking on the Budgets 2001 folder, then clicking on the August budget file.

The desktop interface does give you some shortcuts to things you do often -- the current budget would be just three clicks away in start menu documents if it is one of the last documents you accessed. But the speech interface, with its vastly larger command vocabulary, gives you the potential to execute many more actions in a way that is a whole lot closer to the first example than the second.

Here's another example. This is a set of macros that allows you to combine a couple of commands, then eventually combine combinations. The last command, which does the most, is a real custom command -- it works with specific files on our computers.

Except for the first global command, these are specific to Dreamweaver:
You must have the KimDreamweaver macros installed for the Dreamweaver-specific commands to work.

"30 40 Word"
"Make Link"
"30 40 Make Link"
"30 40 Link" (This is a more concise way to say the above command.)

The following two commands are custom commands that I use and are not included in my macros. They are a good example of how far you can go in combining commands, however. They take the previous command several steps further by finding the exact file to link to.

"30 40 Glossary Link C"
"30 40 Glossary C" (This is a more concise way to say the above command.)

Unfortunately, the speech interface is still in its infancy, and so the language that will do this is not readily available out of the box. You can get at least part way there if you work at it, however.

One way not to use these mouse macros is clicking through the menus of the programs you regularly use. Menus are like directories -- they're great in the first instance, but once you get to know what you want to do it becomes tedious to go through the menu instead of just saying a command. Menus are a good illustration of how much more efficient speech commands can be. Saying a first level command directly will increase your productivity by 100 percent.

In learning to control a computer efficiently with speech it's important to listen to yourself working. If it is difficult to listen as you are working, tape a session at the computer, then listen. You'll find you are saying the same combinations of things over and over again -- these are places where you can work on combining commands to speed things up.
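If you write down the commands from a taped session, one per line, a short Python sketch like the one below can point out the pairs you say back to back most often. This assumes you have transcribed the session yourself. The session list here is made up, and repeated_pairs is a hypothetical helper name, not a feature of any speech program.

```python
from collections import Counter


def repeated_pairs(commands, min_count=2):
    """Return consecutive command pairs said at least min_count times,
    most frequent first."""
    pairs = Counter(zip(commands, commands[1:]))
    return [(pair, n) for pair, n in pairs.most_common() if n >= min_count]


# A made-up session, one spoken command per entry.
session = [
    "30 30", "Select Paragraph", "Font Blue",
    "40 40", "Select Paragraph", "Font Blue",
    "Save As",
    "20 20", "Select Paragraph", "Font Blue",
]

for pair, n in repeated_pairs(session):
    print(n, pair)
```

In this made-up session the only repeated pair is "Select Paragraph" followed by "Font Blue", which makes it a candidate for a single combined command.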

If you use NaturallySpeaking, take a look at the natural language commands. Some of these combine commands. Consciously choose the vocabulary you want to use, then print out, or write out, or have someone make for you procedure sheets, and stick to them until you know them well.

Fortunately there is a free utility out there that makes it easy to print out lists of commands. David Austin's SayWhatPro allows you to see what you can say globally and in specific programs, and you can choose commands from a list, then print just those you have chosen. He's updated the utility for NatSpeak 6. You can download it from http://www.speechutilities.com/SayWhatPro/SayWhatPro.htm

Here are some Word natural language commands that will speed you up considerably. Using a mouse they take 2, 7, 7 and 6 clicks, respectively.

"Print Document"
"Print Pages 2 through 4"
"Make 5 by 7 Table"
"Make Paragraph Green"

For a larger list of Word natural language commands, take a look at the italicized commands that appear at www.scriven.com/RSI/RSIdata/KimsMacros/Doc/Word.html
(Or say "Help on Word" in a browser.)

I have some mild complaints about the natural language print and formatting commands and these extend to many of the built-in commands. Although they are very powerful, they're more difficult to learn and remember than they need to be.

The print commands are missing something. The first two commands are excellent, but there is no command that leaves you at the print window where you can make changes. Especially when you're learning these commands, it's nice to see what's happening without the document being suddenly printed. A very common macro addition to these is a simple "Print That" command that clicks on File and Print.

In the same vein, I tend to shorten the natural language commands -- it is easier to say "Paragraph Green", than "Make Paragraph Green."

Another thing to think about when you're choosing commands is how easy they are to pronounce.

A lot of users have changed "mouse click," "mouse double click," "mouse right click" for instance, to something else because those phrases are fairly long and because the word "click" especially, is very difficult to say. I use "Touch," "Touch 2" and "Touch Right", and use both "Touch Right," and the shorter "Right" in combined commands.

Another way to make controlling the computer easier is to be only as precise as you need to be.

The mouse placing commands I have demonstrated can be very precise -- actually they can be more precise than you've seen because there's another set that I almost never use that allows you to go pixel by pixel instead of using ten-pixel segments. What you do is add a decimal point.

These are global macros being used in Photoshop:

"40.4 by 60.3"
"40.5 by 60.3"

"Drag down .7"
"Drag down left 1.2"
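Here is a rough Python sketch of what the decimal point adds. It assumes, as described above, that a whole number is a 10-pixel step and each tenth is a single pixel. The names steps_to_pixels and fine_position are hypothetical. Decimal is used instead of floating point so that a spoken number like "60.3" converts to exactly 603.

```python
from decimal import Decimal

PIXELS_PER_STEP = 10  # one whole grid step is 10 pixels


def steps_to_pixels(spoken_number):
    """Convert a spoken grid number such as "40.4" or ".7" to whole pixels."""
    return int(Decimal(spoken_number) * PIXELS_PER_STEP)


def fine_position(down, across):
    """Convert spoken "down by across" numbers to an (x, y) pixel position."""
    return (steps_to_pixels(across), steps_to_pixels(down))


print(fine_position("40.4", "60.3"))  # (603, 404)
print(steps_to_pixels(".7"))          # 7, as in "Drag down .7"
print(steps_to_pixels("1.2"))         # 12, as in "Drag down left 1.2"
```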

But, when controlling the computer, I think it's best to be as imprecise as possible, because this is cognitively easier.

This concept is behind a set of mouse commands that only uses the first number -- often in a document you only really need to identify what you want to do by line.

Here are some examples:

"40 Line"
"40 Paragraph"

"40 Delete Line"
"40 Copy Line"
"40 Cut Paragraph"

"40 Enter 2"
"40 Break" (This macro puts a line break after the period.)
"40 Break 2" (This puts two lines after the period.)
"40 Join" (This joins two lines together)

"20 Through 45"
"15 Through 25"

Here's another example that is a fairly long combination of commands -- these allow you to copy text from a window into a voice clipboard file in one command. For the clipboard macros to work you must have the KimGlobal set of macros installed and the "Store" folder in your C:\My Documents directory.

Voice clipboard macros:

"15 Copy Line to 1"
"Open 1" (This opens file 1 so you can see what you have copied.)

"Copy All to 2"
"Open 2"
"15 10 Copy All to 2"
"Open 2"

"15 Copy Line to 1 Stay" (This leaves file one open.)

For a full explanation of the voice clipboard macros, which include files named 1-20 and Alpha-Zulu, see www.scriven.com/RSI/RSIdata/KimsMacros/Doc/Copying_Cutting_and_Pasting.html
(Or say "Help on Copying" in a browser.)

A related concept is making the target large. You can get pretty good at nailing even tiny little boxes with these commands, but it's faster and easier to work with a larger target, and you can in many situations. When you're working in a document the finest grid you really care about is what line down you are on and what word across. In the next example, you want the cursor to be at the beginning of a word, but you don't have to be that precise in aiming because the command puts you to the left of the nearest word.

"40 45 Word"
"40 45 Before"
"40 45 After"
"40 45 Word Apostrophe S"

The large target concept is also behind commands like "40 Line", "40 Paragraph" and "40 40 Paragraph"

And so the keys to making controlling the computer easier are to be as concise as possible and as imprecise as possible.

A third thing to think about is context.

Sometimes different commands are more or less easy depending on what you're doing. You can identify which types of macros are best to use with which tasks by paying attention to exactly how cognitively difficult certain macros are.

When something is hardwired -- reaching for the escape key, or saying "Go Home" or "Font Blue", once you have gotten used to it -- it is a much easier task than saying something new. This is why the select and say commands, while extremely useful, can be somewhat tiring. You are finding a word, identifying it and pronouncing it. We do this all the time so it's no big deal, but it is harder than doing something by rote using hardwired commands.

I use the select and say commands all the time. And I find that they are easiest to use when I'm not thinking about something else. For instance, when I'm making changes to a story I have already edited on paper.

I also use the select and say commands when I'm writing, but not all the time. There are times when it is cognitively easier to use a different kind of command -- like "Select Paragraph" -- rather than identifying the first and last words of the paragraph and using a select and say command. I suspect this is because the part of my brain that identifies words is used in both composing words and using the select and say commands.

This is a subtle point, but you can make things easier if you notice in what situations certain commands are easier or harder to say and choose which types you want to make instinctive in which situations.

In addition, although often a select and say command is the most efficient way to do something, sometimes you can be even more efficient. Commands like "40 Cut Line" and "30 30 Delete 3 Paragraphs" would each take two commands using select and say.

By the same token, it is often not necessary to move the mouse. It's much better to say a menu command directly than waste three commands clicking through windows. Again, you can find many of these types of commands in the natural language command set. Here's another example. Although you could use the mouse coordinate, drag commands ("40 40 Drag Down 10") to move and size windows, here's a better way:

Moving Windows:

"Window Down 10"
"Window Right 20"

Sizing Windows:

"Size Window Bigger 5"
"Size Window Smaller 10"

Something else that can make things a little easier is memorizing the Alpha-Zulu words that represent the letters. Even though you can spell using letters in the correction box, being able to spell in context will occasionally speed you up. One example is saying URLs and email addresses. You can say, for example, "person at alpha whiskey charlie delta dot university dot edu" without pausing, to get this address: person@awcd.university.edu.
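The mapping from Alpha-Zulu words to letters is simple enough to sketch in Python. This is only an illustration of the idea, not how the speech program does it. The word list below uses common spellings that may differ from the ones your software expects, and spoken_to_address is a hypothetical name.

```python
# Common spellings of the Alpha-Zulu words (an assumption; your speech
# software may spell some of them differently).
WORDS = (
    "alpha bravo charlie delta echo foxtrot golf hotel india juliet kilo "
    "lima mike november oscar papa quebec romeo sierra tango uniform "
    "victor whiskey xray yankee zulu"
).split()

# Each word stands for its own first letter.
ALPHA_ZULU = {word: word[0] for word in WORDS}
SYMBOLS = {"at": "@", "dot": "."}


def spoken_to_address(phrase):
    """Turn a spoken address into text: Alpha-Zulu words become letters,
    "at" and "dot" become symbols, and other words are kept as they are."""
    pieces = []
    for word in phrase.lower().split():
        if word in ALPHA_ZULU:
            pieces.append(ALPHA_ZULU[word])
        elif word in SYMBOLS:
            pieces.append(SYMBOLS[word])
        else:
            pieces.append(word)
    return "".join(pieces)


print(spoken_to_address(
    "person at alpha whiskey charlie delta dot university dot edu"
))  # person@awcd.university.edu
```

A sketch this simple cannot tell when "at" or "dot" is meant as an ordinary word, so it only makes sense for phrases that are known to be addresses.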

The Alpha-Zulu words also make some other macros more accurate. Here are some more examples of combination mouse macros, and also some file opening macros that illustrate that point. The first two sets are global macros that can be used in any program. The third set is also a good example of a long combination -- it opens a program and leaves you close to the file you want in one command:

Word Open File Window:

"30 45 Charlie"
"30 45 India"

Word:

"Open File Alpha"
"Open File Tango"

Starting with Word Closed:
You'll need to have the KimGlobal macros installed for these to work.

"Word Open Echo"
"Word Open Charlie"

If you have a speech recognition program like the professional version of NatSpeak that allows you to use other people's macros, take a look at other people's macros with an eye toward finding commands that speed up the specific work you do.

If you have the other versions, take a good look at the built-in macros, and choose the combinations that make the most sense to you.

Learn the commands by arranging them in a directory in an order that makes sense, and using them until they are instinctive. Use an old-fashioned highlighter to mark the commands you think you will use more often.

You will eventually get much faster than your colleagues who are using their hands to communicate with the computer, and you'll feel more like you're walking at your own pace rather than trying to match the frenetic, tiny steps of the mouse.

It's also good to remember that as computers get more mobile, the need to be conversant in a spoken command and control language will become more and more common, and learning it now will just put you ahead of the curve.

Here are a bunch more mouse macros -- these are all global commands that can be used in any program:

The first command allows you to cut and paste to burn a CD. When I burn a CD I use a single command instead -- I made a program-specific command to carry out the three actions in a row. Using the mouse commands, however, makes it fairly fast to do things like this in any program the first time you use it.

Combining Mouse Commands with Copying and Pasting:
The demonstration showed these commands using the Create CD program. It is useful in any program or group of programs where you have to copy the contents of one window into another.

"25 35 Copy"
"50 35 Paste"

Try these in Word:
"40 30 Three Words"
"40 30 Bold Five"
"40 30 Bold Line"
"40 30 Paragraph Green"
"40 Sentence Paste" (This and the next command paste after the nearest period.)
"40 30 Sentence Paste"

Try this one in the Word Open File Window:
"25 37 Enter"

Here are some further thoughts on speeding up speech:

Using speech also means rethinking your defaults -- things you've gotten used to doing using a keyboard because they were the most efficient way to carry out commands using your hands. Here are some of the shifts I've made:

I more often use the Find window to select words from a whole document, especially when I'm making changes that are written on a paper copy. ("Please Find", Say Words to Find, "Find Close". These commands are in both KimDreamweaver and KimWord)

My mouse cursor is set to snap to the default choice, so I can just say "Touch" to get the default choice. (To change the setting, go to Mouse in the Control Panel, and check the "Snap to Default" box under the Motion tab.)

It's useful when macros indicate when an action is done. It is worth it to use the longer "Save As" and "Enter" instead of just the save command so you can see that your document is really saved. The save command without the enter is also fairly dangerous. If it trips accidentally you cannot take it back by saying "undo that" like you can with most commands.

Along the same lines, there are a couple of dangerous built-in commands that I've deleted -- one example is the single word command "Delete".

When I am writing, I more often use undo commands. ("Undo 5 Times", "Redo 7 Times" in KimGlobal.)

I make changes while I'm writing differently using speech -- it's usually easier to correct by selecting a few words and saying them over than correcting something letter-by-letter, and in a phrase where I want to make a couple of modifications in different places, I often just select the whole thing and say it over.

You can say "No Caps" before one-word commands like "Backspace" and menu commands like "Window" to get real words rather than commands. You can also accomplish this by speaking the word as part of a phrase. The word "Select" presents a different challenge. To get the real word, say it in the middle or end of a phrase or by itself.

If you use the mouse macros to find coordinates to use in the SetMousePosition scripting command, make sure to flip the numbers around -- the mouse macros use the Y coordinate first. And also remember to add a zero on the end of the numbers, because the mouse macros are by 10-pixel increments.


Please send comments and suggestions to kpatch@scriven.com.


© Copyright 1999-2002 Kimberly Patch & Eric Smalley. All rights reserved.