Tesseract

Tesseract is a fine FLOSS OCR (Optical Character Recognition) package that can help us expose the truth about M$ to the public through Google. Google indexes text pages very well but many of the .PDF documents in records of courts are images of documents that have been photocopied a few times too often. Tesseract can convert them to text.

Groklaw has been doing a fine job of this in Comes v M$. I thought I would spread some joy to US DOJ v M$.

Here is an example: http://www.usdoj.gov/atr/cases/exhibits/365.pdf.
I convert it to TIFF images and turn Tesseract loose on it to get text. Debian has all the packages I need. Here is the result:

“$ ls
365.pdf
$ pdf2tif 365.pdf
$ ls
365-01.tif  365-02.tif  365-03.tif  365-04.tif  365.pdf
$ for f in *.tif; do tesseract "$f" "$f"; done
Tesseract Open Source OCR Engine
Tesseract Open Source OCR Engine
Tesseract Open Source OCR Engine
Tesseract Open Source OCR Engine
$ ls
365-01.tif  365-02.tif  365-03.tif  365-04.tif  365.pdf  365-01.tif.txt  365-02.tif.txt  365-03.tif.txt  365-04.tif.txt
$ cat *.txt > 365.txt”
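The OCR loop and the final cat can be folded into one small shell function. This is only a minimal sketch, assuming the TIFF pages already exist with zero-padded names like the ones above. The function name ocr_pages and the OCR_CMD variable are my own inventions for illustration, not part of Tesseract; OCR_CMD is just a hook for swapping in a different OCR command.

```shell
#!/bin/sh
# Sketch: OCR every TIFF page in a directory and join the text in page order.
# OCR_CMD is a hypothetical hook (not part of Tesseract) so another OCR
# command can be swapped in; it defaults to tesseract.

ocr_pages() {
    dir=$1   # directory holding the page images, e.g. 365-01.tif ...
    out=$2   # combined text file to write
    : > "$out"                       # start with an empty output file
    for f in "$dir"/*.tif; do
        [ -e "$f" ] || continue      # the glob matched nothing
        # "tesseract IMAGE BASE" writes its text to BASE.txt
        "${OCR_CMD:-tesseract}" "$f" "$f" || return 1
        cat "$f.txt" >> "$out"       # zero-padded names keep pages in order
    done
}
```

Called as `ocr_pages . 365.txt`, it should leave the same per-page .txt files as the transcript. Because it appends each page's text explicitly instead of using `cat *.txt`, a second run does not try to feed an existing 365.txt back into itself.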

Here is the text after fixing glitches:

From: Joachim Kempin

Sent: Tuesday, December 16, 1997 5:37 PM

To: Bill Gates

Cc: Steve Ballmer; Paul Maritz; Joachim Kempin

Subject: As promissed OEM pricing thoughts

draft OEM DTOS

PRICING.doc feedback appreciated

MS7 007193

CONFIDENTIAL

DT OS pricing strategy

During our offsite last weekend the OEM team discussed this issue and this is a summary of our conclusions.

Current situation

The current ASP decreases for PC manufacturers will make us a much higher component of their system cost then ever before. We expect that <1k PCs will be bought by consumers and business and could constitute more then 50% of all PCs by C-mas of 1998. In case we see $500 PCs be next C-mas our royalties we could be as high as 10% of total system prices and if the biz PC markets gets eroded by <1k PCs we will with an NTWN solution be in the same position. While we have increased our prices over the last 10 years other component prices have come down and continue to come down. This is in particular true for CPU prices, where AMD and CYRIX are clearly under $50/unit components with packaging COGs of $20-25. Intel has higher costs today because of their packaging and I estimate that their current average CPU price is around 170-180$ with 40-60 $ in packaging costs (so the money they are getting for their IP on silicon is $120-140 in average, which compares with NTW prices being between $100 and 120 typically). I am interested in listening to them when they explain to us their low-end strategy in Dec. When comparing system prices over the last couple of years with today’s prices we should note that in the low end segment PC manufacturers have started pulling out monitors and other items from their systems. As a result my comparison is not 100% correct- but this does not change the trend. We have to assume that not all of the <1k PCs will be less powerful or just consumer focussed. Many will be less capable as OEMs strip peripherals. We are making this easy with USB, 1394, device bay etc. Easy transfer of peripherals to new PCs could be the result and the vanilla core architecture might get artificially even cheaper. At least this would make it easier for us to defend our pricing as we increase our BOM percentage.

OEM division revenue growth over the last 8 years has depended heavily on volume increases and a trend to higher priced OS. During that time ASPs have stayed stable or have gone up which made it easier to ride the wave and get the value we deserve. We have shown larger then 40% growth rates annually and expect in the future that OEMs will take a very hard look in how to avoid paying us more $5 per system in order to hit most aggressive price points. Will this lead to significant higher volumes and thus allow us to relax some prices while gaining share where we need it? The danger does exist that more PCs might get shipped without an OS and we should not take this lightly!

While reasons for volume increases are too early to analyze (US data still sketchy and ASIA/LATIN data really convoluted) we expect the following to happen:

1. Moderately more volume by finding new buyers who can now afford to buy PCs

(This should be true for consumers as well as small biz)

2. Acceleration of replacement cycles

(Knowing that 80M PCs cannot run NTW or WIN 98)

3. Shortening of PC “life time” in general

The only counter argument to make here is that current PC technology is totally sufficient for most office tasks and consumer desires and that any performance bottleneck is not in today’s PCs but in today’s COM pipes. This in itself might slow down replacement cycles and life time shortening until we find true MIPS eating applications- a priority not only INTEL should subscribe to.

Other side effects of the <1k PCs are less need for NCs, NetPCs and WIN terminals as long as we deliver on the well managed aspect of the PC environment within 12 month. If not customers might not wait for us and pilot more alternative solutions. I do not have to say what this means for NT 5.0 delivery.

MS7 007194

CONFIDENTIAL

Pricing options

PC industry growth after the Asian crisis settles down should go back to normal and might wind up for CY 99 and 2000 in the 20%+ figures. This could help us to ease up on increasing prices – but the drive to NTW needs to continue and as we go along we might conclude that the market will not bear $100+ prices for NTW. Our options,

1. Peg DT OS prices to type of CPU or system price

Both methods are an administrative nightmare for the OEMs and us. This worked when we had only 3 CPU types and the one with the higher royalty had a long cycle time - today we have too many types (I can just Intel calling me feeling we treat them unfairly by putting all their competitors into the low end bucket) and the cycle times are so unpredictable that we recommend against this. We have priced once on manufacturer cost and it is a sure way to totally erode your model without having any control. We rejected this as well.

2. License for limited time and create annuity business

This is the best thing long term but it might disrupt end user operations and could require enduser registration. I wrote a memo about this more then a year ago. This will need technology and infrastructure to be set up something we are not seriously working on. So until NTW 6.0 comes out- say CY 2001 this is not an option. We need a champion for this now, if we want to do this.

3. Reduce DT OS content which OEMs install and sell add on retail packs

This is a viable option if we can make the add-on pack a stunning piece of technology and a “must have” for every PC owner. Performance, management and ease of use features come into my mind. Again we need to start this now in order to be ready at NT 6.0 time frame.

4. Defend current model

We believe that we ducked the bullet for 15-18 month and bought some time to explore the above opportunities. Only 3 major contracts are not agreed upon. The one company who is pushing the hardest for a price break for the sub 1k category is Compaq and I expect a major fight and escalation on this subject. The answer here has to be “no” for all people involved.

With this in mind let’s agree on the following objective, strategy and tactics:

Objective:

To get the highest amount of $/unit for DTOS through the OEM channel without breaking the current model of pre-installing the SW on PCs.

Strategy:

Avoid price increases for DT-OS over next 2-3 years and be sensitive to NT pricing and prepared to revisit as we go along.

Tactics:

– Reduce some of the more rigid licensing requirements, which increase costs to the OEMs.

– Step up our marketing efforts with OEMs to help them to sell more PC units

– Give OEMs air cover by promoting high-end PCs purchases by providing more future technology directions

– Continue to level the playing field between SB, Named and MN accounts.

– Increase demand creation for NTW PCs slowing down OEM’s ASP erosion

– Resist <1k PC royalty price decreases firmly

– Reward OEMs who are willing to increase their NTW penetration until NTW 5.0 ships

– Review MOLP and SELECT waterfall as well as Academic and special government pricing options and agree on a company wide pricing model without allowing any exceptions on subsidiary or area level

MS7 007195

CONFIDENTIAL

Who can derail this plan and MSFT counter tactics:

OS competitors

SUN

Sun and it’s coalition with Java. For the next 2-3 years the barriers are huge for them and even IBM after studying this technology is not convinced it would satisfy customers when implemented during that time frame. In addition there is the compatibility barrier and the fact that OEMs see SUN as the enemy and will not be easily convinced to be a distribution channel for them.

OEM coalition

Our high prices could get a single OEM( Compaq might pay us 750M$ next year) or a coalition to fund a competing effort (say in India). While this possibility exists I consider it doubtful even if they get a product out that they can market it successfully, leapfrog us and would not deviate from their own standard to differentiate. Could they convince customer to change their computing platform is the real questions. The existing investments in training, infrastructure and applications in windows computing are huge and will create a lot of inertia.

No bundling of OS on low end systems would be the easiest way to hurt us- but who would want to start with this and loose business?

ISV

NSCP may come from the browser side, but I consider them too weak to succeed alone- so they are only dangerous if they team up with SUN. Again compatibility and yet another platform are the biggest inhibitors.

INTEL

We read about it in the news today and over the last couple of weeks. If they decide to own the OS as well as the CPU our business it will get ugly. This could be an INTEL lead and funded coalition- say with Compaq and NSCP. I am convinced they have been thinking about this for some time. They could buy SUN SOFT or start a skunk work project on their own. If they decide to sell the OS for $1 and the CPU for $ 200 they will get the OEMs on their side. The customer inertia argument remains and that will prevent them to build momentum easily. Our reaction could be to buy Nsemi or AMD or both and own the CPU and the SW business- while both stocks (INTEL and MSFT) are taking a dive. We would sell SW at $100 and CPUs at costs + $1.

How sure are we of our partnership and how fast could we react if needed? We could bring compatibility to another platform better then anybody else and we would have the money to fund the fab capacity.

Bill, please send me some feedback, does it make sense to discuss this with a larger audience?

MS7 007196

CONFIDENTIAL

Isn’t that interesting? While M$ was telling the world it was innovative, it was looking at ways to stifle the competition and to sell consumers stuff they did not need. Note that M$ felt pressure from sub-$1000 PCs in 1997. How must they be sweating now, with PCs at $100-$300 and M$ having raised prices again? Well, the higher-priced units are not selling. The lower-priced units are having to cut prices to compete. That means the cash cow is drying up. Do you really need a quad-core CPU and a video card that can do 3D at 200 frames per second? Do you see competition in the market or do you see OEMs, Intel and M$ colluding to keep prices high? This year, OEMs will be under a lot of pressure to dump M$ because ARM will sell and do it all without the OEMs and without M$. To keep moving units, OEMs will have to cut prices. Hardware is already at rock-bottom, so the cut will have to come in software. Good-bye M$. Hello GNU/Linux.

Business basically buys no-OS PCs in bulk and writes disc images to them or uses thin client technology. They really have no need for the M$ tax to raise their cost of acquisition. Businesses compete. If the competition adopts GNU/Linux and thin clients, others will follow. It’s happening.

About Robert Pogson

I am a retired teacher in Canada. I taught in the subject areas where I have worked for almost forty years: maths, physics, chemistry and computers. I love hunting, fishing, picking berries and mushrooms, too.

2 Responses to Tesseract

  1. Yes. There are still a few people running word-processors on machines of that era and they are fine. You have to give credit to salesmen who can sell something not needed, but the scope and depth of some of the tactics are disgusting. Not only have they tied up the OEMs and retailers to offer little choice, they go out of their way to make PCs slow down with bloat/malware/non-features.

    I have used P1s as thin clients and as long as I had a 100 Mbit/s NIC, it was fine. Anything above 400 MHz is waste for many purposes. The trend pushed by Intel and AMD to get ever-more-powerful processors when there is a single 7200 rpm hard drive inside is wasteful. Those processors are idling most of the time. I can run 30 simultaneous users on many modern processors and each feels they are running smoothly on a powerful machine. My terminal server here is a 2001-era server with a single 32-bit processor and 1 GB of RAM, and it blows away XP on machines years newer. In side-by-side speed tests GNU/Linux beats XP every time, but if consumers don’t see that, they cannot choose.

  2. Michael says:

    The most interesting statements to me:

    “The only counter argument to make here is that current PC technology is totally sufficient for most office tasks and consumer desires and that any performance bottleneck is not in today’s PCs but in today’s COM pipes. This in itself might slow down replacement cycles and life time shortening until we find true MIPS eating applications- a priority not only INTEL should subscribe to.”

    A decent desktop PC of that era was specced significantly lower than even a BeagleBoard, let alone netbooks, which exceed that. A 266 MHz ‘MMX’ (and Alpha) was the hot new thing back then (http://www.cpushack.com/CIC/announce/1997.html), and a typical high-end desktop config probably had 64 MB or so of RAM.

    Yet the common thread from msm and the big vendors is that these netbooks are just ‘jumped up typewriters’ or simply junk.
