
I'm sure Joel Spolsky has forgotten more about UI design than I'll ever know, but I'll take a shot anyway. A few days ago he wrote this about disabling menu items:

Users see the disabled menu item that they want to click on, and are left entirely without a clue of what they are supposed to do to get the menu item to work.

Instead, leave the menu item enabled.

I actually got to this post through this thoughtful reply from Daniel Jalkut about why disabling menu items is usable:

Disabled menu items convey valuable information. Users who are skimming menus in order to figure out what to do are trained by years of experience to skim past disabled items and look for enabled ones instead. The more complex the application is, the more valuable this dichotomy becomes. In essence, disabling menu items gives application designers a means of “funneling” user attention to the actions in an application that will actually work at this moment in time.

And, he further goes on to make the distinction between usable and learnable interfaces. There are probably a lot of ways to think about it, but basically I reserve "usability" for productivity related issues that users who use a system a lot run into and "learnability" for first-time or occasional users. Daniel makes this point:

The point is to build a framework for application learnability that does not seriously affect the usability of the application for experienced users.

Generally, I agree. However, some applications never have experienced users (meaning users that use the application nearly every day for long periods of time). For me, the applications that need to have high user-productivity (usability) are Office, Visual Studio, my browsers, and a handful of websites. For others, I appreciate some help orienting me, and I wouldn't mind non-disabled menus that explain why they can't work. If your application is used only occasionally or mostly by first-timers, then concentrate on learnability and don't disable menu items.

It's not always that simple. For example, I have to use Photoshop and Gimp at my job, but not as a designer and only occasionally. Most people who use them for work probably keep them open all day and might find them very usable. For me, their steep learning curve makes them hard to use. However, I think it's right for the authors of those applications to ignore me. They should disable menu items that cannot be used in the current context because they have a large number of users who use the application very frequently.

There is a third option though, one that I think gets both usability and learnability. Just do the operation of the menu. If the user is picking the menu item, then they think it makes sense -- figure out what they think the menu item should do and then do it, or as Joel once put it:

Thus, the cardinal axiom of all user interface design:

A user interface is well-designed when the program behaves exactly how the user thought it would.

While reading the blog articles cited above, I quickly checked the menu of my browser (IE 7.0) to see which items were disabled (image on the right). I don't know what "Security Report" and "International Website Address" do, and the status bar message for them gives no indication when these might be available. I even did a quick look in the help and Google, but I didn't find anything useful.

Of course, the help could be better, but really, I bet that you could generate a "Security Report" for any web page. Maybe the report would be simple or not give much information, but I would like to see what a report might have. This is not the same as just popping up an error, which would not generate a report. If you wanted to have some explanation of why it was so sparse, that's easily integrated into the report.

Another option is to not put the items into the menu to begin with. If these items are so rarely available, then they probably shouldn't even be there. IE already uses the status bar for website/context specific actions, and these could probably be relegated to a less obvious place.

So, I don't think it's as clear as never/always disable menu items. Personally, I try to always make the menu enabled and just do the operation (undo is essential if you do this pervasively), but I would certainly disable a menu item that could not possibly make sense (e.g. "Save" when no document is open or "Copy" when nothing is selected). If I were writing an application that was only used occasionally, then I would opt for Joel's suggestion of leaving all menu items enabled and using them as a way to explain the usage of the application.

Last March, I blogged about a speech David Pogue gave at AIIM.

David Pogue gave a talk on the power of simplicity. His recommendations: pre-sweat the details, count the taps (like Palm), and that simplicity sells. He gave iPod, iPhone, Tivo, and Wii as examples.  On that note, if you're at AIIM, come by our booth to win a super-simple Wii (and also see how simple it is to annotate documents on the web).

Looks like it was based on this presentation to TED in 2006. It's an excellent presentation and worth seeing both for his style and the content.

I started my first Google App Engine application this weekend. I wanted to jot down some initial thoughts and tips.

1. My experience with it was greatly enhanced by the fact that I have built websites with Python and Django and have some experience with other WSGI-based web frameworks. Given that, Google's API just "makes sense"--meaning, it's what I expected. Anyway, aside from knowing the language, I already have my machine set up for developing this way -- and I have a basic sense of how to structure applications with APIs like this. I'm not sure what it'd be like if you were coming from just ASP.NET, since it's quite different.

2. If you are building the app on a Mac, I can't recommend the Google App Launcher highly enough. You can do everything it does with the command-line interface, but this makes it a lot easier.

3. Somewhat surprised that GQL doesn't support joins. Google claims that joins make it hard to distribute the database.

One big feature that you may have noticed that our Datastore doesn't have, though, is joins. The reason for this is that joins are usually a source of performance problems in a distributed system, when you go beyond a single machine: it's much harder to efficiently support a join on a distributed system that spans many computers and many hard disks.

That actually underscores the strength of the Datastore, and why what we are doing is exciting. The Datastore is built on Google infrastructure like GFS, the Google Distributed Filesystem, and Bigtable, our horizontally scalable distributed storage layer. You can read papers about both GFS and Bigtable online, if you'd like-- they're fascinating. The long and the short of it, though, is that Bigtable is a storage system that can be used to support queries like ours, but which doesn't run on a single computer, or even a sharded set of computers, like a SQL database does. Rather, it is a fault-tolerant, distributed system that can span tens of thousands of hard disks on thousands of machines, making all of them appear to be a single storage table. It moves your data around and restructures the system automatically to account for hotspots and increased storage.

What it all means, is that if you use the Datastore correctly, making efficient queries and thinking ahead a little bit about how your app will grow, the datastore will make it easy to scale your application as it grows, from a few entities to millions. No need to shard your databases, or to restructure your schema. This is part of how App Engine makes it easy to scale your web application from the ground up. 

I have to trust them on this, but I still have basically relational data. I feel like I'm going to basically re-implement joins if I don't find some guidance from Google on how to structure data. I really like their way of thinking of data (basically glorified stored associative arrays) because it's easy to imagine how to do iterative development with it. I have experimented with this in the past and the problems come later when you try to make it perform -- but with Google muscle behind it, maybe I don't have to worry about that.
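To make the "re-implement joins" worry concrete, here is a rough sketch of what an application-side join looks like. It is written in JavaScript with plain in-memory objects standing in for stored entities; it does not use the App Engine API, and every name in it (authorsByKey, posts, joinPostsWithAuthors) is made up for illustration.

```javascript
// Hypothetical data: plain objects stand in for stored entities.
// authorsByKey plays the role of a key -> entity lookup.
var authorsByKey = {
    a1: { name: 'Ada' },
    a2: { name: 'Grace' }
};

// Each post stores the key of its author instead of relying on a join.
var posts = [
    { title: 'First post', authorKey: 'a1' },
    { title: 'Second post', authorKey: 'a2' },
    { title: 'Third post', authorKey: 'a1' }
];

// Application-side "join": for every post, look up the referenced author
// by key and merge the fields we care about into one result row.
function joinPostsWithAuthors(posts, authorsByKey) {
    return posts.map(function (post) {
        var author = authorsByKey[post.authorKey];
        return {
            title: post.title,
            authorName: author ? author.name : null
        };
    });
}

var joined = joinPostsWithAuthors(posts, authorsByKey);
```

Each lookup here is cheap because it goes by key, which is the kind of access an associative-array style store is good at. The cost shows up when the lookups have to be repeated for many rows or many relationships.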

4. Some easy way to handle payment processing is the obvious missing API (aside from things they aren't trying to do at all).

5. Really psyched that Memcached is there out of the box. Also, their simple Imaging API is actually pretty useful for simple websites.

So, Saturday, May 31st was the last day of 31 apps in 31 days.  Go over to the gallery page where you can find applications that use DotImage to create images, edit them or interact with online image services like Flickr, Scribd, and Amazon. There are web-based apps, desktop GUI apps, and command-line utilities, and the full source is provided for all of them.

 

Today, we launched 31 Apps in 31 days, where the engineers at Atalasoft are going to deliver one application made with DotImage each day in May (including weekends).

Steve is first up with a Motivational Poster generator. At Atalasoft, if you break the build, you own the build chicken until you fix it. The transfer of the chicken is announced with a blood-curdling chicken scream.


 

I feel motivated already (mmmmm..... Dropped caps).

I previously showed how easy it is to generate Code 39 Barcodes using DotImage, now I want to show you how to do it in pure JavaScript.

I wanted to make it easy to deploy so I don't use images. Luckily, 1D barcodes are easy to make with just colored table cells, so I just generate a one row table with a column for each bar.  I set the width of the column to be the width of the bar and then color it either black or white.

Here's a page where you generate barcodes with just JavaScript. The code is attached to this entry.

To use it, just add Barcode.js to your site, include it in a script tag, then use the AtalasoftBarcode39 object to draw the barcode (there is a file called BarcodeTest.html in the zip file to show an example).  Here's some sample code

<script src="Barcode.js" type="text/javascript"></script>
<script type="text/javascript">
function writeBC(s)
{
    var bc = new AtalasoftBarcode39(s);
    document.getElementById('bcArea').innerHTML =
        bc.getBarcode(50, 2, 8);
}

writeBC("ABC123")
</script>

<div id="bcArea"></div> 

The three arguments to getBarcode are the height of the bars, the width of the narrow bars, and the width of the wide bars.
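The table-cell trick itself is small enough to sketch on its own. The function below is not the code from Barcode.js; renderBars and its pattern format (a string of 'n' for narrow and 'w' for wide elements, alternating bar, space, bar) are assumptions made for illustration, and the pattern used in the example is just a sample rather than a lookup from the Code 39 table.

```javascript
// Hypothetical helper, not part of Barcode.js.
// pattern: string of 'n' (narrow) and 'w' (wide) elements that alternate
//          bar, space, bar, ... starting with a bar.
// height, narrowWidth, wideWidth: sizes in pixels.
function renderBars(pattern, height, narrowWidth, wideWidth) {
    var cells = [];
    for (var i = 0; i < pattern.length; i++) {
        var width = pattern.charAt(i) === 'w' ? wideWidth : narrowWidth;
        // Even positions are bars (black), odd positions are spaces (white).
        var color = i % 2 === 0 ? '#000' : '#fff';
        cells.push(
            '<td style="width:' + width + 'px;height:' + height +
            'px;padding:0;background-color:' + color + '"></td>'
        );
    }
    // One row, one column per bar or space.
    return '<table cellspacing="0" cellpadding="0" ' +
        'style="border-collapse:collapse"><tr>' +
        cells.join('') + '</tr></table>';
}

// Example: nine elements, with the same sizes as getBarcode(50, 2, 8).
var sampleHtml = renderBars('nwnnwnwnn', 50, 2, 8);
```

The returned string can be assigned to an element's innerHTML the same way the sample code above does with the result of getBarcode.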

 


Attachment(s): Barcode.zip

I just noticed this blog entry from picturetom pointing to an interesting study on trends in digital photography available for purchase from InfoTrends.

One of our customers, David Cardinal from ProShooters, fits this bill:

‘Today´s photographers are technologically savvy, as a significant majority is using image editing software and many are using RAW conversion and color management software’, commented Ed Lee, Director at InfoTrends.

‘It´s particularly interesting to note that 83% of professional photographers are using the Web as part of their business and about 30% use an online photo service provider. This suggests that a variety of Web services providers could see future growth opportunities.’

David is a photographer who can code and about a year ago wrote a review on choosing an imaging toolkit (ours) for Dr. Dobb's.

At Pro Shooters (www.proshooters.com), our primary product is DigitalPro for Windows, an image-management system for digital photographers. We sell to serious photographers who have large image libraries and often need to process hundreds or even thousands of images at a time. As a result, we serve a niche market and can only afford a small programming team. That means we need to be more productive than the competition to stay ahead. So, instead of trying to do everything ourselves, we rely on tools such as third-party libraries in areas where we can find excellent tools that don't compromise on quality and features.

David also pushes us to continually expand our RAW support (mostly supporting more and more cameras) -- DotImage 6.0 and the upcoming 6.0b both have support for more cameras than previous releases.


OCR Engines are very good at reading text from clean machine print documents. If you have older scans or if the documents are not meant to be easily read by a machine, there are still some things you can do to improve your accuracy.

This report from the GPO on Optimizing OCR Accuracy made some interesting findings, but also some errors which I will try to explain.

The report cites that thresholding the documents didn't improve accuracy and shows this result:

The issue is not that thresholding wouldn't help. In fact, since most OCR engines can only work on thresholded documents, they will do it for you if you do not. They are right to point out that the scans should be done at full color -- but that's because you then get a chance to apply the thresholding yourself (instead of letting the scanner do it). If you use a good thresholding algorithm, you can do quite a bit better.

Using DynamicThresholdCommand from DotImage Document Imaging and SpeckRemovalCommand from Advanced Document Cleanup with default parameters, I got this result:

I don't have the original, so I cannot check OCR accuracy, but I bet I will get a better result than they found using the default threshold in Photoshop. In any case, a threshold must be done before OCR, so either you do it under your control or the OCR engine will do it for you.
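To show what "applying the thresholding yourself" means in code, here is a minimal sketch of a global threshold chosen with Otsu's method. This is a stand-in technique, not what DynamicThresholdCommand does (its algorithm is not described here), and otsuThreshold and binarize are names invented for this example. The input is assumed to be a flat array of 8-bit gray values.

```javascript
// Pick a global threshold with Otsu's method: choose the gray level that
// maximizes the between-class variance of "dark" and "light" pixels.
// gray: array of integers in the range 0..255.
function otsuThreshold(gray) {
    var hist = new Array(256).fill(0);
    for (var i = 0; i < gray.length; i++) {
        hist[gray[i]]++;
    }

    var total = gray.length;
    var sumAll = 0;
    for (var t = 0; t < 256; t++) {
        sumAll += t * hist[t];
    }

    var sumDark = 0;      // sum of gray values at or below t
    var countDark = 0;    // number of pixels at or below t
    var best = 0;
    var bestVariance = -1;
    for (t = 0; t < 256; t++) {
        countDark += hist[t];
        sumDark += t * hist[t];
        if (countDark === 0) continue;        // no dark pixels yet
        var countLight = total - countDark;
        if (countLight === 0) break;          // no light pixels left
        var meanDark = sumDark / countDark;
        var meanLight = (sumAll - sumDark) / countLight;
        var diff = meanDark - meanLight;
        var variance = countDark * countLight * diff * diff;
        if (variance > bestVariance) {
            bestVariance = variance;
            best = t;
        }
    }
    return best;
}

// Turn a gray image into black (0) and white (255) pixels.
function binarize(gray, threshold) {
    return gray.map(function (v) {
        return v <= threshold ? 0 : 255;
    });
}

var samplePixels = [10, 12, 11, 200, 205, 210];
var sampleThreshold = otsuThreshold(samplePixels);
var sampleBinary = binarize(samplePixels, sampleThreshold);
```

A single global threshold like this struggles with uneven lighting or stained paper, which is where a locally adaptive threshold is likely to do better.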

Another problem they found is with downsampling. They had scans at high DPI, but the OCR vendor recommended 300 DPI, so they downsampled. I am sure that the OCR vendor meant at least 300 DPI, so they did not have to do this. Downsampling discards information, so it is certain to reduce OCR accuracy. There are benefits to downsampling (increased speed), and if you do apply it, you must choose a good algorithm, but if accuracy is the main concern, then you should not do it.

The use of image processing before OCR can increase accuracy, but you must use the proper algorithm. A limiting factor of their tests is that they applied their preprocessing steps manually with Photoshop and therefore could not try a lot of different options. By using an image processing toolkit, you can easily run a lot of tests in batch. You are essentially solving an optimization problem, so applying a hill-climbing or genetic algorithm would help decide the best processing choice for your collection of documents.
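As a rough illustration of the hill-climbing idea, the sketch below searches over a vector of numeric preprocessing parameters. Everything here is hypothetical: hillClimb is a name made up for this example, and the score function is supplied by the caller. In practice it would run the preprocessing and OCR over a sample of documents and return the measured accuracy; the example uses a simple made-up formula so that it runs on its own.

```javascript
// Simple coordinate-wise hill climbing.
// start:         initial parameter values, e.g. [threshold, speckleSize]
// score:         function(params) -> number, higher is better
// steps:         how far to nudge each parameter per move
// maxIterations: safety limit on the number of passes
function hillClimb(start, score, steps, maxIterations) {
    var current = start.slice();
    var currentScore = score(current);

    for (var iter = 0; iter < maxIterations; iter++) {
        var improved = false;
        for (var i = 0; i < current.length; i++) {
            var deltas = [-steps[i], steps[i]];
            for (var d = 0; d < deltas.length; d++) {
                // Try moving one parameter one step in one direction.
                var candidate = current.slice();
                candidate[i] += deltas[d];
                var candidateScore = score(candidate);
                if (candidateScore > currentScore) {
                    current = candidate;
                    currentScore = candidateScore;
                    improved = true;
                }
            }
        }
        // Stop when no single step improves the score (a local optimum).
        if (!improved) break;
    }
    return { params: current, score: currentScore };
}

// Stand-in score with a known peak at [5, -2]; a real score would be
// OCR accuracy measured on your own documents.
function toyScore(p) {
    return -((p[0] - 5) * (p[0] - 5)) - ((p[1] + 2) * (p[1] + 2));
}

var searchResult = hillClimb([0, 0], toyScore, [1, 1], 100);
```

Hill climbing only finds a local optimum, so restarting from several different starting points and keeping the best result is a common precaution.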

(Full Disclosure: I was a tech reviewer for this book and received a free copy)

ADO.NET 3.5 Cookbook

I read the ADO.NET 3.5 Cookbook last November, coincidentally while I was writing automatic image fetching from SQL Server and OleDB databases into DotImage 6.0 (DbImageSource).

I've been using the various incarnations of Microsoft data access technologies for quite some time and have been using ADO.NET for a few years, so I wondered whether I was going to learn anything new from this book. It covers all of the territory to get started (connection strings, basic usage of ADO.NET classes, etc.), but what I really appreciated was that it also covers topics that advanced ADO.NET users would find useful, and I certainly learned a few new tricks.

The topic on writing provider and database independent code (Section 10.22) which covers how to do it right if you are targeting .NET 1.1 (which we do) was particularly useful to me. Chapter 10 (Optimizing .NET Data Access) is just generally a good chapter no matter what your level and covers asynchronous SQL calls (executing and cancelling), ASP.NET data caching, paging queries, SQL Server stored procedure debugging and more.

The other thing I liked was the general best practices advice that was sprinkled into the recipes in appropriate places. If you are new to writing DB code, read the "Storing Connection Strings" section (1.1) carefully so that you do it correctly. Bill not only explains how, but why. And since this is the first recipe, it sets the tone for the rest of the book as being practical ("here's the code"), but also gives you the background to understand it.

Since my job was to actually run every code snippet, I can vouch for their quality. Most are built off the AdventureWorks sample database that comes with SQL Server Express, so they are ready to run. The rest come with full DDL to create what you need (databases, stored procedures, etc), and the code and SQL is available online so you don't have to type it in.

And, of course, since it's updated for ADO.NET 3.5, it includes information on LINQ and SQL Server 2008. 

If you are in the Springfield, Massachusetts area, you might want to check out the RTC. Today, I attended an event that they put together, called Breaking Through: IT-Enabled Business Opportunities. The next one builds on some of the ideas in it, and is on Social Networks (May 2nd).

Paul Gillin gave the keynote about the new influencers (bloggers, online communities and social networks). He offered some examples as inspiration for trying to use these phenomena as an alternative to mass-media and traditional advertising-based campaigns (Will it Blend). He also gave some examples of what the new influencers can do if you get things wrong (AOL Hell), and what you can do if it happens to you (respond quickly).

The panel brought together some IT executives to talk about key IT trends. On the ECM Imaging front, many of the participants mentioned Document Imaging as a key initiative. Drivers for that included increasing productivity, cutting costs (and paper), and collaboration; a key issue was searchability of the resulting repository.

Next month, Atalasoft is going to deliver 31 Apps in 31 Days (one for each day in May). We are inviting anyone outside of Atalasoft that wants to participate to submit applications. We're looking for small, useful desktop applications that use DotImage and that you are willing to distribute for free (you can have an upgrade to a pay version).

Applications submitted by non-Atalasoft employees are eligible for a prize. If you want to join in, go to the 31 Apps sign up page and we'll tell you how to submit your entry.

So, Rick, Jacob, Adam and Elaine have all given their impressions. I guess I should chime in.

The thing that struck me about Code Camp is that with a couple of people organizing, some sponsors and volunteer speakers, you can put on a technical conference that just blows away what's out there. Sure, the food could be better (how would you feed 500+ people on a budget? Pizza!), and everyone sure would want access to wifi, but here's what I think Code Camp gets right.

1. Free. For someone who has to manage a budget, this makes the decision easy. We put up five developers in a hotel in Boston and took them out to dinner, and it still cost less than sending just one developer to a standard conference.

2. Hardly any vendor presentations (if any). The Code Camp manifesto makes it very hard (all code must be free). I gave a presentation on writing cmdlets in Powershell. To do that, I needed some interesting .NET objects to wrap, so I chose DotImage. I don't think anyone would think I was hawking DotImage, but, just to make sure, we gave away free copies of DotImage Photo (fully functioning copies, not evals) so that everyone who came to the talk could run my code. Ok, I guess showing MS tools and APIs counts as a vendor presentation, but most of that is available for free too.

Vendors need to be very careful about just giving demos at conferences -- look at what Bruce Eckel had to say about the last Pycon:

I believe that this year's Pycon organizers suffered from inexperience and naivete, because they didn't know that some vendors will ask for anything just to see how far they can push it. And that it's a negotiation, that you must push back rather than give in just because the conference might get some money for it. More importantly, that the imperative to grow Pycon does not mean "at all costs." I've already spoken to more than one vendor who was dismayed by the state of things, so we are not talking about all vendors here by any means.

At first the morning plenary sessions -- where the entire conference audience was in a single room -- just seemed a bit commercial. But then I slowly figured out that the so-called "diamond keynotes" were actually sold to vendors. It must have sounded great to some vendors: you get to pitch to everyone and nothing else is going on so the audience is trapped.

From what I saw at AjaxWorld last year and heard from this year's -- they are also leaning more towards vendor presentations. Code Camp presentations (even by vendors or authors selling books) were full of useful content -- I never felt like I was being sold to.

3. Very developer centric -- all attendees were invited to give presentations -- this is the perfect place to get some practice giving talks. Most presentations showed code and built something in the time allotted. I saw a game built in XNA, an IronPython extension added to a .NET application, a bunch of useful HTTP Handlers and modules built from scratch, and a lot more.

4. Access to experts -- I have pages of notes for ideas for how to improve our product and process at Atalasoft. Michael Cummings told me about InternalsVisibleTo, which alone was worth spending the weekend in Waltham. I got a chance to talk to Edwin Ames after seeing his presentation on TDD, Behavior Driven Development and mocking frameworks, and he told me more about NMock2 and Rhino. I got some info on FileSystem fields in SQL Server 2008 from Matthew Roche.

Ok, here's how you create a CmdLet that uses DotImage:

1. Download the attached zip and open the solution

2. Build it -- my dll is named AtalasoftSnapin.dll

Here is the code for GetImage.cs, the implementation of the get-image cmdlet

using System;
using System.ComponentModel;
using System.Management.Automation;

using Atalasoft.Imaging;

namespace Atalasoft.Commands
{
    // the name of the cmdlet is get-image
    [Cmdlet("Get", "Image")]
    public class GetImageCommand : Cmdlet
    {
        private string _filename;

        // it has one parameter and it is mandatory
        // and can be piped in
        [Parameter(Mandatory=true, Position=0,
            ValueFromPipeline=true)]
        public string Filename
        {
            get
            {
                return _filename;
            }

            set
            {
                _filename = value;
            }
        }

        protected override void BeginProcessing()
        {
        }

        // take the string they pass, treat it as
        // a filename, and load it into an image
        protected override void ProcessRecord()
        {
            WriteObject(
                new AtalaImage(_filename));
        }

        protected override void EndProcessing()
        {
        }
    }
}

3. Install it with these lines in Powershell

C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\installutil AtalasoftSnapin.dll

add-pssnapin AtalasoftSnapin

4. Now in Powershell, you can type

> get-image "full path to image"
> dir *.jpg | get-image

After adding in the cmdlets for gray-image and combinechannels-image, you can now create the red-blue 3D images I showed.

1. Take two pictures, one with your left eye, and one with your right
2. Make them both gray:

> get-image "path to image" | gray-image | save-image "output image"

3. And now put the left image in the red channel, and the right image in the blue and green.

> combinechannels-image -r (get-image "left image") -g (get-image "right image") -b (get-image "right image") | save-image "3d image"

Replace "left image" and "right image" with the names of your gray images, and replace "3d image" with the name of your output image.
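For anyone curious what the two cmdlets do to the pixels, here is a minimal sketch of the same idea in JavaScript. It does not use DotImage. The function names are invented for this example, images are assumed to be flat RGBA arrays (the layout used by canvas ImageData), and the luminance weights are a common choice rather than necessarily the ones gray-image uses.

```javascript
// Convert a flat RGBA array [r, g, b, a, r, g, b, a, ...] to an array
// with one gray value per pixel. The weights are the common luminance
// weights, assumed here for illustration.
function toGray(rgba) {
    var gray = new Array(rgba.length / 4);
    for (var i = 0; i < gray.length; i++) {
        var r = rgba[4 * i];
        var g = rgba[4 * i + 1];
        var b = rgba[4 * i + 2];
        gray[i] = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    }
    return gray;
}

// Build the red-blue 3D image: the left-eye gray image goes into the red
// channel, the right-eye gray image goes into both green and blue.
function combineAnaglyph(leftGray, rightGray) {
    if (leftGray.length !== rightGray.length) {
        throw new Error('Both images must have the same number of pixels');
    }
    var out = new Array(leftGray.length * 4);
    for (var i = 0; i < leftGray.length; i++) {
        out[4 * i] = leftGray[i];        // red   <- left eye
        out[4 * i + 1] = rightGray[i];   // green <- right eye
        out[4 * i + 2] = rightGray[i];   // blue  <- right eye
        out[4 * i + 3] = 255;            // fully opaque
    }
    return out;
}

// Tiny example: one-pixel "images".
var leftEye = toGray([10, 10, 10, 255]);
var rightEye = toGray([200, 200, 200, 255]);
var anaglyph = combineAnaglyph(leftEye, rightEye);
```

Seen through red-blue glasses, each eye then picks up mostly the image that was taken from its side, which is what produces the depth effect.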

You might think that the whole point of the presentation was to get this picture (click for larger version):


But, it's not -- the point of the presentation is to get this picture:

 

A solution with my code is attached below. 

Ok, you're going to want to install Powershell before reading the rest of this. Once you've done that, try typing these commands at the prompt (type in what appears after the >, the text in italics is just commentary):

Here's the hello world program
> "hello world"
Remember, the result of your expression is just printed by the shell. "hello world" is a .NET System.String

Here's some other primitives and simple expressions:
> 3
> 3+4
> 1GB
> "hello " + "world"
> "hello world".substring(6)
> 1..10

Remember that Powershell's directory primitives operate on a provider interface.  You can call dir and cd on anything that implements it (like the registry)
> cd HKLM:\
> dir
> cd SOFTWARE\Microsoft\Windows
> dir
> c:
> dir

There is extensive built-in help:
> get-help get-process
> get-command *process*
> "hello" | get-member
> get-history
> get-command *history*

You can call .NET assemblies (variables start with $)
> $feed = [xml](new-object System.Net.WebClient).DownloadString(
"http://www.atalasoft.com/cs/blogs/loufranco/rss.aspx"
)
> $feed.rss.channel.item | select title
> $feed.rss.channel.item | ? { $_.title -match "ower" }

And you can call any third-party assemblies (like DotImage).  Once you have installed DotImage, copy Atalasoft.Shared.dll, Atalasoft.DotImage.dll and Atalasoft.DotImage.Lib.dll to C:\WINDOWS\system32\windowspowershell\v1.0 -- this is one of the places Powershell looks for assemblies. You load them like this:

> [System.Reflection.Assembly]::Load("atalasoft.dotimage")
> [System.Reflection.Assembly]::Load("atalasoft.dotimage.lib")
> [System.Reflection.Assembly]::Load("atalasoft.shared")
 

In all of the next examples, you'll want to give full paths to filenames -- Powershell has an odd notion of the current directory which is confusing to explain right now.

Here's how you load an image
> $img = new-object Atalasoft.Imaging.AtalaImage("fullpath to image")

And now run the oilpaint command
> $cmd = new-object Atalasoft.Imaging.ImageProcessing.Effects.OilPaintCommand
> $cmd.BrushWidth = 15
> $img2 = $cmd.Apply($img).Image

And save it (any undefined variable is null, so don't define $null)
> $img2.Save("full path to image", (new-object Atalasoft.Imaging.Codec.JpegEncoder(85)), $null)

That turns this

 

into this


Tomorrow I'll post the code that makes it easier to do this (using Cmdlets) 

Update: Here is Image Processing with Powershell, Part II

This is a placeholder blog entry where I'll post my slides and code for my Extending Powershell talk at Code Camp on Saturday.

The code I am showing uses DotImage, and we want everyone to be able to play with it, so we'll have free DotImage Photo licenses for everyone who comes to the talk (and some schwag to raffle off).

UPDATE: If you are coming here from the link we gave you at the show -- here is Part I from the Powershell presentation.
