
Here’s a good article in Computerworld for determining the true cost of SharePoint. The key message: it’s just like any other enterprise software, so don’t treat it like off-the-shelf software:

If CIOs treat SharePoint as off-the-shelf software, the costs will indeed be onerous. However, if CIOs treat it as an enterprise information platform and content management system, SharePoint will yield tremendous value-and potentially at a fraction of the cost of comparable ECM solutions.

Absolutely right. SharePoint can be managed and added on to by end-users, but there should be some sort of central control to set standards and provide common customizations.

For example:

  • If you care about branding SharePoint, then that is something that should be done by IT or consultants for the entire installation.
  • SharePoint comes with some built-in workflows, but new ones need to be created in code or SharePoint Designer – best to have IT provide the ones you need.
  • Obviously, if you have any kind of complex storage requirements, IT is going to have to deal with that.
  • IT should find and install third-party components that meet common needs. For example, if you have a bunch of departments that each need to incorporate paper-based processes via scanning, viewing, and collaborative editing, then you want to standardize on a document imaging SharePoint solution.
  • The Computerworld article makes a good point about governance. Depending on your requirements, you may need more than just a written policy, and it might be important to build or buy a solution to enforce your rules.

Given a good foundation, users will be able to create a lot of functionality on their own, and they will be able to think about it like they think of other Office apps (like Excel), but there are some things that IT will have to set up to make this possible.


While Atalasoft was in Boston (at SPTechCon) this week, showing our Document Viewer for SharePoint, there was more SharePoint news across town at the Enterprise 2.0 Conference. Fierce Content Management is reporting on Christian Finn’s comments about SharePoint.

When it comes to document management, Finn [Microsoft’s Director of SharePoint] says they might not have as comprehensive an offering as say EMC Documentum, but what his company does is provide an easy path for people to manage documents. You just right-click, access the document management features and you're on your way. He's not concerned that people consider it a "light weight" or "good enough" offering. "We are not as good as best of breed," he says "but we're OK." He adds that they provide integration to the more comprehensive content management solutions for organizations that require that.

He says ultimately the numbers speak for themselves. "It's a question of positioning," he says. "We have 100 million licenses and 17 thousand enterprise customers. Any vendor can say SharePoint does a lot of things, but not well, but customers are voting with their dollars to deploy it."

With CMIS, and the fact that they are a platform, Microsoft doesn’t need to offer the best product out of the box – they have the easiest to extend ECM platform, and that’s how they will win.

Over on Digital Landfill, the AIIM blog, John Mancini is publishing submissions on “8 Things you need to know about [some ECM topic]”. Experts from all over the AIIMiverse are submitting lists on their area of expertise – definitely worth a read (and look out for my suggestions on reducing hardware costs with image processing).

Jeff Shuey thinks Document Capture in SharePoint is up for grabs:

Companies that had a chance to own this space have faltered. The leading vendor of document capture for 20+ years is Kofax. They have fallen flat and cannot seem to get out of their own way to make anything significant happen. I’m sure there are wins in various places across the globe where Kofax products are being used to address the document capture needs of companies. My point is that Kofax as a company has not made the right moves to go after what I can see is easily a $3-5 million dollar opportunity in the first year and has the potential to be a $1B dollar business.

Jeff used to work at Kofax, and he knows them a lot better than I do – so I’ll let his analysis stand.

Jeff then goes on to mention a few companies that are entering this space, including (ahem), Atalasoft (with Vizit).

Atalasoft is the leading provider of .NET imaging SDKs, which includes our DotTwain product for capture. We’re also a member of the TWAIN working group, creators of InspectorTWAIN (TWAIN driver grading service), and with Vizit SP, we’re bringing our imaging expertise to bear in SharePoint. Another company on Jeff’s list, BlueThread, announced a partnership with us at AIIM in March:

Atalasoft’s Vizit SP, a zero-footprint no-download document image viewer with cleanup, editing, and annotation features, is now fully integrated within BlueThread’s SmartDesk, a highly-configurable end-user ECM application framework. With Vizit SP and the advanced ECM/BPM capabilities of SmartDesk, users can readily view process-enabled content stored within Office SharePoint Server 2007 no matter the document type (Office, PDF, TIFF, CAD, etc.) within a completely web-based SmartDesk interface.

At the end of his blog, Jeff asks if there really is a Capture gap in SharePoint – clearly, we think so – and remember, capture is more than just scanning – it’s everything that goes into getting the document and metadata into your system. That includes barcode reading, OCR, indexing (including human assisted), capture QA workflows, and document clean-up. Each of these represents an opportunity for partners to work with Microsoft to make SharePoint an ideal capture system.


Lars Fastrup has a round-up of the new features of SharePoint 2010. Among them is that CMIS will be supported. I’ve already blogged that Microsoft released a sample project for consuming CMIS in a document library, which has been possible for a while, so I assume that CMIS support must mean that we’ll be able to interact with SharePoint documents via CMIS.

There has been some wondering whether Microsoft could support CMIS without rearchitecting. We’ll see how it looks when it is released, but there is nothing in CMIS that Microsoft couldn’t implement with the current object model, and they have access to the underlying data model, so I have never understood what the issue is. Also, since they are one of the companies driving CMIS, it isn’t likely that they would approve of an interface that they didn’t think that they could implement effectively.

It will be interesting to see if CMIS support will be extended to the Office desktop applications. Right now, Word, Excel, etc., communicate with SharePoint via a proprietary interface, when CMIS would work just as well. If they supported CMIS then they would be able to connect to Alfresco (which already supports CMIS) and other ECM systems that will probably have CMIS by 2010 as well. Of course, there is nothing stopping those systems from implementing Microsoft’s current interface (as Alfresco and Atalasoft have done), but this would be an indication that CMIS is not just for external systems, and that Microsoft is willing to build on it as well.


It’s a cliché that nobody ever wanted a drill, they wanted a hole, but the same is true for search – nobody wants to search -- they want to find. To that end, here are some ways to make content easier to find.

If you make content:

  1. Make something worth finding
  2. Name it well
  3. Put it where people will look for it
  4. Use the jargon of the domain, but also use the words that a layperson would use
  5. Know the rules that search engines use to determine what words are important and follow them
  6. Add meta-data
  7. Create for an audience, and then tell them about it
  8. Turn non-text content into text and add it as meta-data or related content (use OCR for images, lyrics for songs, transcripts for videos, descriptions, etc)
  9. Spell everything correctly, but put common misspellings of keywords somewhere in the content

If you find something that you were looking for:

  1. Remember how you first started looking for it (keywords and location)
  2. Create more content with links to it using your keywords
  3. Add meta-data (add comments, ratings, and tags to content)
  4. Copy or create a shortcut in the place you originally looked for it
  5. Tell people about it in ways that can be searched (e-mails, tweets, IM, Reddit/Digg/StumbleUpon/Del.icio.us, etc.)
  6. Give the search engine feedback if it allows that

If you make a search engine:

  1. Sort the results so that the “best” content is first
  2. Tell content makers how you define “best”
  3. Penalize those that try to game the system and adapt to their behavior
  4. Show parts of the content that help searchers figure out if this is what they are looking for
  5. Let searchers tell you when you have done a good job
  6. Don’t force content makers to do unnatural things to their content for it to be found

If you make content management systems:

  1. Allow anyone to add more meta-data to content (comments, ratings, tags, etc)
  2. Let content be in more than one place without duplicating it
  3. Make the taxonomy of content easy to create and navigate
  4. Automatically add as much meta-data as you can

Wolfram Alpha is built on a curated, proprietary database created over several years by 100+ Wolfram employees. I don’t think that the details of this database have been released, but it’s clear that it isn’t something that was created by the web-spidering, parsing, and organizing algorithms that search engines typically use. That’s too bad, because when I first read about Alpha, I was hoping that I’d be able to apply it to my own content. Standard search engines are really bad at desktop and enterprise search, and I wanted to try something radically different.

The genius of Google’s PageRank when searching the web is that:

  1. It derives relevancy from normal web behaviors
  2. It’s publicly known, so that we have a light-weight way to make our content findable
  3. There is a high reward for following the rules and penalties for abusing them

But, for my desktop content, it doesn’t work so well – specifically, there’s hardly any linking, and it’s not clear to me how much desktop search is taking advantage of relevancy cues (document names and headings, email subjects, etc). It may be, but it’s opaque to me. There is no shortage of articles on SEO advice for web pages, but creating findable documents isn’t as ubiquitous a topic. The usual advice is to add meta-data, but I want to just behave normally and have the engine derive the meta-data.
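To make the linking point concrete, here is a minimal sketch of link-based ranking in the spirit of PageRank. It is a simplified power iteration, not Google’s actual implementation, and the document names, damping factor, and iteration count are all assumptions chosen for illustration. With links, rank concentrates on the document that others point to; with no links at all, as in a typical desktop folder, every document ends up with the same score and the ranking tells you nothing.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration (illustrative only).

    links maps each document name to the list of documents it links to.
    A document with no outgoing links spreads its rank evenly over all documents.
    """
    docs = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(docs)
    rank = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        # Every document gets a small baseline share of rank.
        new_rank = {d: (1.0 - damping) / n for d in docs}
        for d in docs:
            targets = links.get(d, [])
            if targets:
                # Pass this document's rank along its outgoing links.
                share = damping * rank[d] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # No links: spread the rank evenly over everything.
                share = damping * rank[d] / n
                for t in docs:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical web-like content: two pages link to "spec".
web = {"blog": ["spec"], "wiki": ["spec"], "spec": []}
# Hypothetical desktop folder: three files with no links between them.
desktop = {"budget.xls": [], "notes.doc": [], "plan.doc": []}

web_ranks = page_rank(web)
desktop_ranks = page_rank(desktop)

print(max(web_ranks, key=web_ranks.get))  # prints "spec"
print(desktop_ranks)  # all three scores are (approximately) equal
```

This is why desktop search has to lean on other cues (names, headings, subjects) rather than links.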

The same goes for enterprise content, but at least with my own documents, I have a chance of remembering something that will help me find them. With other people’s documents, it’s hard without a good taxonomy and diligent tagging.

Wolfram Alpha offers another way, potentially, but only if it can build the database automatically, or with help from content filters (not humans). Since it builds up a kind of understanding of the content and my query, it’s not dependent on keyword matching. For example, here are some queries that I think Alpha could handle if it had a model of my enterprise documents:

  • “Is Pat on vacation next week”
  • “Who was at the last budget meeting”
  • “list of recruiters I have contacted this year”
  • “average response time to forum questions”
  • “how many blog posts did we write last quarter”

And that’s not even close to what I could do with my already structured data (like sales figures or budget data). These kinds of things are just not possible with current search technology and are usually solved by knowing where to find the information, manually collating it, or (mostly) by not bothering.

It looks like Wolfram Alpha is being used in part to drive Mathematica sales – access to data from the platform will be more powerful than just through the Alpha site. I don’t imagine that their back-end is ready for deployment for servers other than theirs, but I’m hoping that they’re considering it and thinking of standard APIs for content management systems to provide data to it.


Back in the eighties, David Letterman used to have a segment called Limited Perspective Movie Reviews, where movies would be reviewed by experts that only concentrated on a single aspect of the movie – for example, a dentist would review the teeth of the actors or a mortician would review Creepshow and only talk about how realistically the bodies decayed. In that spirit, I offer this short review of the usage of ECM in the new Star Trek movie.

Spoiler Alert: I have to give away some details of the movie in this review.

It’s a few hundred years in the future and information overload is still basically solved by serendipity. The entire plot hinges on Kirk seeing a similarity between some events from 25 years ago, a quick description of a distress signal from Vulcan, and an overheard conversation about an intercepted Klingon communication.

There doesn’t seem to be much to tie them except a Romulan reference, which would be rare because, according to the original series, there were very few interactions with them. Any contact with Romulans was probably exceptional, so two within a day would be noteworthy, and Kirk had intimate knowledge of the older event.

But, why do they need to rely on luck? Shouldn’t their super-advanced computer systems alert them when there’s relevant information available? Kirk even mentions that the older event is well-chronicled in his “files” and that his captain knows all about it – so the problem is not capture, but something that we haven’t scratched the surface of yet – computer systems that notice connections and bring them to your attention. Even Wolfram Alpha doesn’t do anything with its computational model of all knowledge unless you ask it a question.

It seems that, like today, they can collect data and search it, but there isn’t any way for the system to analyze data as it is captured, make connections, and alert. Today, the closest I can come is with Google News Alerts and RSS feeds set up to search Twitter and other sources. But, I have to pick the keywords – I guess I could imagine some kind of Starfleet Twitter where someone is monitoring #romulan and sees the connection, but with so little traffic on the word, it doesn’t seem likely. But, it is its low traffic that makes it interesting this time – perhaps someone parked on Trending Topics would notice it.

So, I guess we’re stuck with that for at least another three hundred years – and being able to make those connections will still be a prized skill that makes one worthy of ridiculously quick promotions.


Last week, I commented on Oracle’s new Office Suite (Open Office was acquired with Sun), and now I read about the new release of Beehive and its price drop (from $120/user to $50/user).

This paragraph from CIO says it all:

According to independent analyst Peter O'Kelly, Beehive represents Oracle's fourth attempt to crack the collaboration market, which has been long dominated by Microsoft and its Exchange and SharePoint products, and IBM with its Lotus Notes and Domino software.

Last updated in 2005, Collaboration Suite "failed to put a dent in the universe," O'Kelly wrote in a blog at Beehive's launch last fall.

The way Oracle talks about Beehive shows their bias to overvalue the middleware/backend:

Also, while "people hate switching, there are immediate hard dollar savings," Gilmour said. He reiterated that Beehive, which stores all data in an Oracle database, is more scalable than competitors. "It is just way better when you're living in a real database," he said.

Microsoft has always led with applications, and brought along the infrastructure with it. As an example, see how Joel Spolsky’s Platforms article describes how Windows was introduced:

The first versions of Windows included a freely redistributable runtime so that if you wrote a Windows application, you could sell it to anyone with DOS, you weren't limited to the few weirdo dorks (me!) who bought Windows 1.0.

People just don’t care about the infrastructure as much as they care about the applications that run on it. If Beehive is to be successful, it has to do that on its strengths as an application. They are clearly making sure to integrate with Exchange and Outlook, but there is no mention of the rest of the Office Suite. You absolutely can’t compete with SharePoint until you fully integrate with Office.

Peter O’Kelly did a round-up of Beehive’s features in September, and I don’t see the other killer SharePoint feature on it – ad-hoc lists. His conclusion is spot-on:

Oracle Beehive is likely to be popular with many Oracle-focused enterprises, especially those using one or more Oracle applications, as Beehive services will be integrated into future releases of Oracle's myriad enterprise application offerings (and are already included in a new Oracle Social CRM product). I doubt Beehive will broadly displace Outlook/Exchange and Notes/Domino for enterprise messaging in the immediate future […]

Beehive looks like it would be fine if you are already on Oracle applications, but I wouldn’t see it as a viable competitor to SharePoint – if this is their fourth attempt, it feels like a golden sombrero to me.

Last night, Atalasoft and Snowtide Informatics sponsored a meeting of the Western Mass Developers Group, which featured Ben Fry speaking about Computational Information Design and the Processing programming language. O’Reilly also pitched in by giving us a few copies of Ben’s book, Visualizing Data, to give away.

The talk was a fascinating journey through Computational Information Design, an interdisciplinary field that encompasses aspects of Computer Science, Mathematics, Statistics, Graphic Design, User Experience Design, and Human Factors Engineering. Visualizing Data goes into more depth, and you can see his early work describing this in his dissertation.

The main crux is that the process of data visualization has these steps:

  1. acquire – the matter of obtaining the data, whether from a file on a disk or from a source over a network.
  2. parse – providing some structure around what the data means, ordering it into categories.
  3. filter – removing all but the data of interest.
  4. mine – the application of methods from statistics or datamining, as a way to discern patterns or place the data in mathematical context.
  5. represent – determination of a simple representation, whether the data takes one of many shapes such as a bar graph, list, or tree.
  6. refine – improvements to the basic representation to make it clearer and more visually engaging.
  7. interact – the addition of methods for manipulating the data or controlling what features are visible.

Each of these steps requires skills from traditionally different fields – and Computational Information Design unites them.
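As a rough illustration, the seven steps can be sketched as a tiny pipeline. This is written in Python rather than Processing, it uses made-up inline data, and the function names simply mirror the step names; none of it comes from Ben’s talk or book.

```python
import csv
import io
from collections import Counter

# Made-up sample data standing in for a file on disk or a network source.
RAW = """city,category,amount
Boston,scanning,3
Boston,viewing,5
Springfield,scanning,2
Amherst,viewing,4
Boston,scanning,1
"""


def acquire():
    # acquire: obtain the raw data (here, an in-memory string).
    return io.StringIO(RAW)


def parse(source):
    # parse: give the data structure and types.
    return [
        {"city": r["city"], "category": r["category"], "amount": int(r["amount"])}
        for r in csv.DictReader(source)
    ]


def filter_rows(rows, category):
    # filter: keep only the data of interest.
    return [r for r in rows if r["category"] == category]


def mine(rows):
    # mine: a very simple statistic, the total amount per city.
    totals = Counter()
    for r in rows:
        totals[r["city"]] += r["amount"]
    return totals


def represent(totals):
    # represent: choose a basic form, here a plain text bar chart.
    return [f"{city} {'#' * total}" for city, total in totals.items()]


def refine(totals):
    # refine: the same chart, sorted by size with aligned labels and values.
    width = max(len(city) for city in totals)
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [f"{city:<{width}} {'#' * total} {total}" for city, total in ordered]


def interact(category):
    # interact: let the viewer control what is shown by choosing a category.
    rows = filter_rows(parse(acquire()), category)
    return refine(mine(rows))


for line in interact("scanning"):
    print(line)
```

Even in a toy like this, each function calls on a different skill: file handling, data modeling, statistics, and layout.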

A key aspect of his work is the development of Processing, a programming language designed to be accessible to non-programmers, but powerful if needed. It has many simple abstractions to make graphics and interaction programming very simple, and can export to an Applet or Application for easy distribution. Much of his talk was dedicated to showing us some of the visualizations that have been created using it, including some that interact with the real-world in interesting ways. Works ranged from DNA visualizers, to book edition differencers and algorithmic art. OpenProcessing has many examples with full source code.

One of the big advantages that SharePoint has is that it’s tightly integrated with Microsoft’s Office suite. Some vendors (like Alfresco, and us with Vizit Scan to SharePoint) have recognized this and have implemented the same APIs that SharePoint does to integrate with Office.

Oracle now controls the #2 and #3 products best able to understand the internals of MS Office file formats (#1 is, of course, Office itself). In my opinion, Open Office is #2, and Oracle’s Outside In has a great reputation and is powering a lot of ISV products (and presumably their own). Even though the Office specs are now open, they are extremely complicated and we’re not likely to see a lot of new products that are built around them soon – it’s more likely that OO and Outside In will use them to get better, though, and both teams could conceivably learn from each other.

In addition to having this competency strengthened, Oracle now owns a full-blown office suite, which I’d say is the 2nd best out there (iWork is up there, but doesn’t have real word processing), and it does a great job of reading MS Office formats (and many others).

How could they take it to another level to compete with Microsoft? Probably they can’t, but here are some ideas:

  1. Integrate it with SharePoint using those same APIs that Alfresco and Atalasoft are using. I tried to replace Visio, but this one feature is the reason I can’t adopt anything else. That’s the same with Open Office – no company using SharePoint can adopt it.
  2. If they ever get some kind of beachhead with Open Office, perhaps use that to drive people to Oracle’s Universal Content Management. They’d probably have the best chance if it were a SaaS model. Another thing they get with Sun is a deploy-time story that means they should be able to deliver a great service. I said before that Sun needs to exploit this knowledge with a SaaS offering, and Oracle has a whole suite of enterprise applications that could benefit from Sun’s expertise at deploy-time.
  3. Of course, they could still do #2 without Open Office (and should) – if they do, they need to integrate with MS Office anyway. They could use their knowledge of Office formats to make it possible to deal with Office formats on the server, and not even need an Office install for some use cases.
  4. Obviously, they need to push for more wide adoption of Open Office document standards over Microsoft’s – not so easy, but they have some advantages over Sun.
  5. They need to close their UI gap with Microsoft – Sun doesn’t help here – they are just as bad. I don’t have any suggestions except for crazy ones (acquire Apple or Adobe) – if they want any chance to compete on the desktop, they need some user experience help. This extends to web interfaces – SharePoint is getting a lot of praise from end-users and powers 5 of the top 10 intranets. Most ECM vendors (including Oracle) are conceding the desktop to Microsoft and trying to figure out how to power the backend of SharePoint – that’s a big mistake. Microsoft knows how to turn a desktop advantage into a server one – anything Oracle could do to make SharePoint better will be replaced by Microsoft (or a partner) at some point.

About a year ago, I wrote about some of the technical issues around image storage in a database – mostly I concentrated on some of the implementation issues.

Over at Never Talk When You Can Nod, Andrew Chapman has been laying out the case for taking blobs out of the database and storing them on the filesystem – specifically with regards to SharePoint and documents.

Of course, since Andrew works for EMC, you would expect that he would recommend filesystem based storage, because they offer products based around this kind of storage – in fact a big reason for EMC to acquire Documentum was to acquire a customer base that needed to efficiently store blob-like data. SharePoint’s default behavior of storing the blobs in the database makes a lot of their products less effective.

Given that bias, he still makes a compelling case that at deploy-time, the benefits are too great to ignore:

Imagine a world where content created in SharePoint was automatically routed to the most appropriate location depending on factors such as values in the object's attributes, where the object is in its lifecycle and/or who created it. Imagine that this was done without in any way affecting the SharePoint end user experience or any applications built on top of SharePoint. Imagine if doing this didn't just reduce risk and costs but it also made your SharePoint deployments more scalable and robust.

He then lays out the case, and in a further post he lays out the options with SharePoint, specifically describing the difference between Remote Blob Storage (RBS) and External Blob Storage (EBS). Again, he would want to push EBS, because that’s an API that EMC can write to and get you on their storage solutions, whereas RBS is implemented internally by SharePoint/SQL Server (using SQL Server’s new FILESTREAM storage). However, he does make a compelling argument for EBS (e.g. selectivity and transparency). Make sure to read Andrew’s full list of pros and cons of EBS and RBS.

EBS is something we get asked a lot about with regards to Vizit – specifically, does Vizit work with document libraries that are using EBS for the document storage. As Andrew describes, EBS is implemented below the SharePoint Object Model, so that it’s transparent to any add-ons written on top of it – so yes, Vizit, and any SharePoint features written on top of the SharePoint Object Model are completely compatible with EBS (and RBS which is implemented at an even lower layer inside of SQL Server).

I’m headed to the Northeast User Group Summit on Saturday in Boston. I’m involved in the Western Mass Developer Group and Western Mass .NET – I think some other members from both are making the trip.

If you’re from Western Mass or otherwise want to meet, send me an email at lou.franco @ the domain hosting this blog.


Jeff Shuey has put 16 years in the ECM space working for Microsoft and Microsoft partners, so when he talks about partnering with Microsoft, it’s worth paying attention to. Two recent posts caught my attention:

First, Jeff talks about how partners fit in to the SharePoint ecosystem.

[…] SharePoint is really only complete when partners are involved.

Partners have created hundreds if not thousands of solutions. Some of which are listed here and here, but there is so much more to be done. If executed correctly the potential exists for SharePoint to become the framework that underpins many of the solutions of tomorrow.

As I alluded to above - SharePoint by itself is not the be all - end all solution. However, when combined with the experience, expertise, and energy of the partner community SharePoint really shines.

Of course, as a Microsoft partner with an ECM viewer SharePoint add-on, I’m all ears. The whole post is worth reading to get a sense of what end-users, partners and Microsoft get out of the relationship. Generally, since Microsoft is a platform company, they rely on strong partners to adopt and extend their platform. Jeff’s other post on this topic explains why partners should build on Microsoft platforms:

The 1:3:5 Ratio – Why is it important? This is a ratio I have vetted over the last 10+ years in working with and for both Microsoft and partners to verify and validate the value proposition for both Microsoft and the partner ecosystem. It is a conservative estimate of what partners and Microsoft can expect to generate in revenue from each engagement. When done right this ratio can be increased and more importantly can be made to be extremely predictable and repeatable. This is the golden ticket for Microsoft and Partners.

What does the 1:3:5 ratio mean to Microsoft? to Partners?

The 1:3:5 ratio is a set of numbers that correlates to $1 for an ISV and $3 to Microsoft and $5 to the SI. It relates to what the ISV can expect to make; it tells Microsoft what they can expect to make; and it tells an SI what they can expect to make for each engagement / sale they make with Microsoft products. Again, when done right it makes for a very predictable and repeatable business model. Smart partners know this is the key to their success in the Microsoft partner ecosystem. Microsoft knows that building this ecosystem is in their best interest too.

Since SharePoint sales are at about $1B, that gives a crude estimate of the SharePoint ISV market at $333M.
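The arithmetic behind that estimate can be sketched as follows. It assumes the 1:3:5 ratio applies uniformly to all SharePoint revenue, which is a big simplification; the SI figure is the same extrapolation applied to the third term of the ratio and is not a number from Jeff’s post.

```python
from fractions import Fraction

# Jeff's ratio: $1 to the ISV, $3 to Microsoft, $5 to the SI.
ISV, MICROSOFT, SI = 1, 3, 5

# Rough SharePoint sales figure used above, in dollars.
microsoft_revenue = 1_000_000_000


def scale(part, revenue=microsoft_revenue):
    # Scale Microsoft's revenue by part / 3 to estimate another party's share.
    return revenue * Fraction(part, MICROSOFT)


isv_estimate = scale(ISV)
si_estimate = scale(SI)

print(f"ISV: ${float(isv_estimate) / 1e6:.0f}M")  # prints "ISV: $333M"
print(f"SI:  ${float(si_estimate) / 1e6:.0f}M")
```

Treat both numbers as order-of-magnitude guesses, since the ratio is described as a per-engagement rule of thumb rather than a market-wide measurement.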

Microsoft is releasing a version of Windows 7 that is meant for netbooks in developing countries that need an entry-level version of Windows. To do this, they limited the OS to running three applications at once. Ed Bott at ZDNet has done some testing with it, and finds that you can get by pretty well if you are mostly surfing the net and maybe doing one other thing (the limit has loopholes for a lot of common things like control panels, installers, explorer windows).

Anyway, there’s some chatter that this pushes users towards browser-based apps – Silicon Alley Insider even goes so far as to call it a big wet kiss from Microsoft to Google.

I wouldn’t go so far – if you think browser based Office suites are as good as Microsoft Office or OpenOffice, then you should probably be using them anyway (easier doc collaboration, access from anywhere, etc) – the issue is that for a lot of people, they aren't acceptable alternatives, and it’s much easier to just pay for real Windows or go with Linux. They aren’t pushing users to web-based apps, just to better OS’s, and Microsoft already feels like enough people will pick Windows over Linux, so they aren’t too worried about that.

The point of the limitation is to look onerous – even if it’s not. They only want people who can’t afford real Windows to buy the Starter version – the outrage helps them spread that message (don’t buy Starter). In order for market segmentation to work, you have to convince people who can afford the more expensive option to pick it. This is why rebates are usually a pain to get – Joel had a great explanation of market segmentation here.

This is an example of making something that is intentionally broken for people that you don’t want to use it – Seth Godin explains the different ways things are broken here.

Windows 7 Starter is designed so that you won’t want it, so if you don’t want it, then ... mission accomplished.
