Friday, March 25, 2011

What's New in iPad 2

Apple has released the new iPad 2 as promised. Following are a few of the features you will find in the new iPad 2.


1. The iPad 2 has a new design and is 33% thinner and 15% lighter than the first iPad.

2. The iPad 2 runs on a new dual-core A5 processor.

3. The iPad 2 has two cameras: a front-facing VGA camera and a rear-facing camera that lets users capture 720p HD video. The front-facing camera supports Apple’s FaceTime, and the iPad 2 is the first iPad to get it.

4. Comes with iOS 4.3 (the latest version), which promises faster mobile browsing, among other improvements.

5. Has a built-in gyroscope that enables advanced gaming.

iPad 2 - Courtesy of Apple


6. Supports HDMI Video Mirroring, allowing users to mirror the iPad's display, including video, on HDTVs.

7. The iPad 2’s launch also saw the introduction of two new apps: iMovie and GarageBand. In a nutshell, iMovie lets you shoot and edit videos on your iPad 2 and upload them to video sites. GarageBand, on the other hand, will make a musician out of you.

In Ireland you can buy it here: http://store.apple.com/ie

Wednesday, March 23, 2011

Monirobo Measures Radiation in Japan

According to a report by a Japanese news agency, a radiation-monitoring robot, aptly named Monirobo, is the first non-human responder to go on-site following the partial meltdown at the Fukushima Daiichi nuclear power plant. The machine, which was developed by Japan's Nuclear Safety Technology Centre to operate at lethal radiation levels, reportedly began work Friday, enlisting a 3D camera, a radiation detector, and heat and humidity sensors to monitor the extent of the damage. A second Monirobo, used to collect samples and detect flammable gases, is expected to join its red counterpart soon -- both robots are operated by remote control from distances of up to one kilometer. They join the US Air Force's Global Hawk drone in unmanned surveillance of the crisis.

Friday, March 11, 2011

Microsoft® Community Contributor Award

Dear Mubshir Raza,

Congratulations again for being recognized with the Microsoft Community Contributor Award!

The Microsoft Community Contributor Award is reserved for participants who have made notable contributions in Microsoft online community forums such as TechNet, MSDN and Answers. The value of these resources is greatly enhanced by participants like you, who voluntarily contribute your time and energy to improve the online community experience for others.

Becoming a Microsoft Community Contributor Award recipient includes access to important benefits, such as complimentary resources to support you in your commitment to Microsoft online communities. To find out more about the Microsoft Community Contributor Award and to claim your recognition, please visit this site: http://www.microsoftcommunitycontributor.com/

Thank you for your commitment to Microsoft online technical communities and congratulations again!

Nestor Portillo
Director
Community & Online Support, Microsoft
It was great news. A big thanks to Microsoft for the recognition!

Sunday, February 13, 2011

MAVIS

The Microsoft Research Audio Video Indexing System (MAVIS) is a set of software components that use speech recognition technology to enable searching of digitized spoken content, whether it comes from meetings, conference calls, voice mails, presentations, online lectures, or even Internet video.

About MAVIS

As the role of multimedia continues to grow in the enterprise, government, and the Internet, the need for technologies that better enable discovery and search of such content becomes all the more important.

Microsoft Research has been working in the area of speech recognition for over two decades, and speech-recognition technology is integrated in a number of Microsoft products, such as Windows 7, TellMe.com, Exchange 2010, and Office OneNote. Using the integrated speech-recognition technology in the Windows 7 operating system, users can dictate into applications like Microsoft Word or use speech to interact with their Windows system. The TellMe.com service allows mobile users to get directory services using speech while on the go. Exchange 2010 now provides a rough transcript of incoming voicemails, and in Office OneNote, users can search their speech recordings using keywords.

MAVIS adds to the list of Microsoft applications and services that use speech recognition. MAVIS is designed to enable searching of hundreds or even tens of thousands of hours of conversational speech with different speakers on different topics. The MAVIS UI, which is a set of ASPX pages, resembles a web search UI but can be changed to suit different applications.

MAVIS comprises speech recognition software components that run as a service on the Windows Azure Platform, full-text search components that run in SQL Server 2005/2008, sample ASPX pages for the UI, and some client-side PowerShell and .NET tools. The MAVIS client-side tools make it easy to submit audio and video content to the speech recognition application running in the Azure service using an RSS-formatted file, and to retrieve the results so they can be imported into SQL Server for full-text indexing, which enables the audio and video content to be searched just like other textual content.
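As a rough illustration of the submission side of that pipeline, here is a minimal Python sketch that builds an RSS-style file listing media to be processed. The element names, the enclosure MIME type, and the overall schema are assumptions made for illustration; the actual format expected by the MAVIS client tools is not documented here.

    # Sketch only: build an RSS 2.0 feed listing media files for submission
    # to a MAVIS-style speech recognition service. Schema details are assumed.
    import xml.etree.ElementTree as ET

    def build_submission_feed(media_urls, title="MAVIS submission"):
        rss = ET.Element("rss", version="2.0")
        channel = ET.SubElement(rss, "channel")
        ET.SubElement(channel, "title").text = title
        for url in media_urls:
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = url.rsplit("/", 1)[-1]
            # An RSS enclosure is the standard way to reference a media file;
            # the MIME type here is a placeholder assumption.
            ET.SubElement(item, "enclosure", url=url, type="audio/wav")
        return ET.tostring(rss, encoding="unicode")

    print(build_submission_feed(["http://example.com/lectures/lecture01.wav"]))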

MAVIS is currently a research project with a limited technical preview program. If you have deployed Microsoft SQL Server, have large speech archives and are interested in the MAVIS technical preview program, contact us.

MAVIS Architecture

Speech-Recognition for Audio Indexing Backgrounder

There are two fundamentally different approaches to speech recognition for audio search: one referred to as phonetic indexing and the other as large-vocabulary continuous speech recognition (LVCSR).

  • Phonetic indexing is based on phonetic representations of the pronunciations of spoken terms and has no notion of words. It performs phonetic-based recognition during the indexing process, and at search time the query is translated into its phonetic spelling, which is then matched against the phonetic recognition result. Although this technique has the advantage of not depending on a preconfigured vocabulary, it is not appropriate for searching large audio archives of tens of thousands of hours because of the high probability of errors in phonetic recognition. It is, however, appropriate for relatively small amounts of audio, as might be the case when searching personal recordings of meetings or lectures. Microsoft has used this technique with success to enable the “Audio Search” feature in Office OneNote 2007.
  • Large-vocabulary continuous speech recognition, or LVCSR, which is used in MAVIS, turns the audio signal into text using a preconfigured vocabulary and language grammar. The resulting text is then indexed using a text indexer. The LVCSR technique is appropriate for searching large audio archives that can be tens of thousands of hours in length. The vocabulary can be configured to enable recognition of proper nouns such as names of people, places, or things. A toy contrast of the two matching styles is sketched just after this list.
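To make the distinction concrete, the Python toy below contrasts where the matching happens in each approach. The phoneme mapping is made up, and real systems use full phonetic dictionaries and recognizers; this only illustrates the difference between matching phoneme strings and matching recognized words.

    # Toy contrast between phonetic indexing and LVCSR-style word matching.
    # The phoneme table below is an invented stand-in for a real dictionary.
    PHONES = {"nuclear": "N UW K L IY ER", "power": "P AW ER", "plant": "P L AE N T"}

    def phonetic_index(words):
        # Phonetic indexing: store only the phoneme stream, no notion of words.
        return " ".join(PHONES.get(w, "?") for w in words)

    def phonetic_search(index, query_word):
        # At search time the query is spelled out phonetically and
        # substring-matched against the phonetic recognition result.
        return PHONES.get(query_word, "?") in index

    def lvcsr_search(transcript_words, query_word):
        # LVCSR: recognition already produced words, so we match text directly;
        # words outside the preconfigured vocabulary can never match.
        return query_word in transcript_words

    spoken = ["nuclear", "power", "plant"]
    print(phonetic_search(phonetic_index(spoken), "power"))  # True
    print(lvcsr_search(spoken, "power"))                     # True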

Although LVCSR-based audio search systems can provide more accurate search results than phonetic-based systems, state-of-the-art LVCSR speech-recognition accuracy on conversational speech is still not perfect. Researchers at MSR Asia have developed a more accurate technique called “Probabilistic Word-Lattice Indexing,” which takes into account how confident the recognition of a word is, as well as which alternate recognition candidates were considered. It also preserves time stamps to allow direct navigation to keyword matches in the audio or video.

Probabilistic Word-Lattice Indexing

For conversational speech, typical speech recognizers can only achieve accuracy of about 60%. To improve the accuracy of speech search, Microsoft Research Asia developed a technique called “Probabilistic Word-Lattice Indexing,” which helps improve search accuracy in three ways:

  • Fewer false negatives: Word lattices represent alternative recognition candidates that were also considered by the recognizer but did not turn out to be the top-scoring candidate. Lattices allow us to find (sub-)phrase and ‘AND’ matches where individual words are of low confidence, but the fact that they are queried together allows us to infer that they may still be correct.
  • Fewer false positives: Lattices also provide a confidence score for each word match, which can be used to suppress low-confidence matches.
  • Time stamps: Lattices, unlike text, retain the start times of spoken words, which is useful for navigation.

Word lattices accomplish this by representing the words that may have been spoken in a recording as a graph structure. Experiments show that indexing and searching this lattice structure instead of plain speech-to-text transcripts significantly improves document-retrieval accuracy for multi-word queries (30-60% for phrase queries, and over 200% for AND queries).
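A minimal Python sketch of the idea, under simplifying assumptions: each lattice arc carries a start time, end time, word, and posterior (confidence) score, and an AND query combines per-word confidences and reports start times for navigation. The scoring rule here (summing posteriors per word, then thresholding) is an illustrative guess, not the exact MAVIS formulation.

    # Sketch of lattice-based keyword search over a toy word lattice.
    from collections import namedtuple

    Arc = namedtuple("Arc", "start end word posterior")

    lattice = [
        Arc(0.0, 0.4, "the", 0.9),
        Arc(0.4, 0.9, "reactor", 0.6),   # top-scoring candidate
        Arc(0.4, 0.9, "rector", 0.3),    # competing alternate kept in the lattice
        Arc(0.9, 1.3, "core", 0.7),
    ]

    def word_confidence(lat, word):
        # Sum posteriors of all arcs carrying the word; alternates that were
        # not the top candidate still contribute (fewer false negatives).
        return sum(a.posterior for a in lat if a.word == word)

    def and_query(lat, words, threshold=0.1):
        # Suppress low-confidence matches (fewer false positives) and return
        # arc start times so a hit can jump straight into the audio or video.
        scores = {w: word_confidence(lat, w) for w in words}
        if all(s > threshold for s in scores.values()):
            times = {w: [a.start for a in lat if a.word == w] for w in words}
            return scores, times
        return None

    print(and_query(lattice, ["reactor", "core"]))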


A challenge in implementing probabilistic word-lattice indexing is the size of the word lattices. Raw lattices as obtained from the recognizer can contain hundreds of alternates for each spoken word. To address this challenge, MSRA has devised a technique referred to as Time-based Merging for Indexing, which brings lattice size down to about 10× the size of a corresponding text-transcript index, orders of magnitude less than using raw lattices.
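As a rough sketch of what time-based merging might look like: arcs carrying the same word whose time spans overlap are collapsed into a single entry with their posteriors summed, shrinking the index while keeping the confidence mass. The merge rule below is a guess at the general idea, not the published algorithm.

    # Sketch of time-based merging: collapse overlapping same-word arcs.
    from collections import namedtuple

    Arc = namedtuple("Arc", "start end word posterior")

    def merge_by_time(arcs):
        merged = []
        for arc in sorted(arcs, key=lambda a: (a.word, a.start)):
            if merged and merged[-1].word == arc.word and arc.start <= merged[-1].end:
                # Same word, overlapping span: fold into the previous entry.
                last = merged.pop()
                merged.append(Arc(last.start, max(last.end, arc.end),
                                  last.word, last.posterior + arc.posterior))
            else:
                merged.append(arc)
        return merged

    raw = [Arc(0.40, 0.90, "reactor", 0.30),
           Arc(0.42, 0.88, "reactor", 0.25),
           Arc(0.45, 0.92, "reactor", 0.10)]
    print(merge_by_time(raw))  # one arc for "reactor" with posterior ~0.65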