Pattern Detection

Pattern Discovery and Detection


Recently I have been trying to solve the problem of finding common word patterns in strings of text.


Most pattern tools (regular expressions, for example) assume you already know the pattern you wish to locate, and then let you search for matches to it. The problem I faced was that I needed to find patterns, as opposed to matching them.


I scoured the web for tools to perform this job, to no avail, so I decided to look for solutions to similar problems. A friend suggested the field of genetic research, where finding commonly occurring patterns is a relatively common task.


I found several cases where pattern discovery used a matrix to solve the problem.


First, one sequence is placed along the x axis of a matrix, and the other along the y axis.


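G | .  .  .  .  .  .
A | .  .  .  .  .  .
T | .  .  .  .  .  .
T | .  .  .  .  .  .
A | .  .  .  .  .  .
C | .  .  .  .  .  .
A | .  .  .  .  .  .
T | .  .  .  .  .  .
A | .  .  .  .  .  .
  +-----------------
    G  T  A  A  C  A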

Then a mark is placed in each cell of the matrix where the symbol on the x axis matches the symbol on the y axis.


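G | X  .  .  .  .  .
A | .  .  X  X  .  X
T | .  X  .  .  .  .
T | .  X  .  .  .  .
A | .  .  X  X  .  X
C | .  .  .  .  X  .
A | .  .  X  X  .  X
T | .  X  .  .  .  .
A | .  .  X  X  .  X
  +-----------------
    G  T  A  A  C  A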

Finally, the matrix is examined to find diagonal lines of markers running from bottom-left to top-right.


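G | X  .  .  .  .  .
A | .  .  X  X  .  X
T | .  X  .  .  .  .
T | .  X  .  .  .  .
A | .  .  X  X  .  X
C | .  .  .  .  X  .
A | .  .  X  X  .  X
T | .  X  .  .  .  .
A | .  .  X  X  .  X
  +-----------------
    G  T  A  A  C  A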


In this example we can see that the longest pattern common to both is ACA. We have discovered this pattern without any prior knowledge of what we were looking for.
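The three steps above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function names are invented for the example, and the rows are indexed in reading order, so a diagonal that runs bottom-left to top-right in the drawings runs top-left to bottom-right in the array.

```python
def match_matrix(xs, ys):
    """Steps 1 and 2: grid[i][j] is True where ys[i] matches xs[j]."""
    return [[y == x for x in xs] for y in ys]


def longest_common_run(xs, ys):
    """Step 3: follow the diagonals and return the longest run of marks."""
    grid = match_matrix(xs, ys)
    # run[i + 1][j + 1] = length of the diagonal run ending at cell (i, j)
    run = [[0] * (len(xs) + 1) for _ in range(len(ys) + 1)]
    best_len, best_end = 0, 0
    for i in range(len(ys)):
        for j in range(len(xs)):
            if grid[i][j]:
                run[i + 1][j + 1] = run[i][j] + 1
                if run[i + 1][j + 1] > best_len:
                    best_len = run[i + 1][j + 1]
                    best_end = j + 1
    return xs[best_end - best_len:best_end]


print(longest_common_run("GTAACA", "GATTACATA"))  # prints ACA
```

Nothing in the function knows what pattern it is looking for; it only follows the marks.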


I decided to try this method on my original problem: finding phrases within text.


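mat | .     .     .     X     .     .     .
the | .     .     X     .     .     X     .
on  | .     X     .     .     .     .     .
sat | .     .     .     .     .     .     .
cat | .     .     .     .     .     .     X
The | .     .     X     .     .     X     .
    +----------------------------------------
      Sit   on    the   mat   said  the   cat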


The common patterns “the cat” and “on the mat” were successfully detected. This technique works better on text than on genetic sequences: there are far more distinct words than there are bases, so chance matches are rarer and there is less 'noise'.
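The same idea applied to words might look like the sketch below. It makes a few assumptions that are not part of the description above: the text is split on whitespace, case is ignored (so "The" matches "the", as in the matrix), and a run needs at least two words to count as a phrase. The function name is invented for the example.

```python
def common_phrases(text_a, text_b, min_words=2):
    """Return phrases of at least min_words words shared by both texts."""
    a = text_a.lower().split()
    b = text_b.lower().split()
    phrases = []
    # Each diagonal of the match matrix is identified by the offset
    # between a word's position in a and its position in b.
    for offset in range(-(len(b) - 1), len(a)):
        run = []
        for j in range(len(b)):
            i = j + offset
            if 0 <= i < len(a) and a[i] == b[j]:
                run.append(a[i])
                continue
            # The run of marks has ended; keep it if it is long enough.
            if len(run) >= min_words:
                phrases.append(" ".join(run))
            run = []
        if len(run) >= min_words:
            phrases.append(" ".join(run))
    return phrases


print(common_phrases("Sit on the mat said the cat", "The cat sat on the mat"))
# prints ['on the mat', 'the cat']
```

Punctuation is not stripped here, so real text would need a slightly smarter tokenizer.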


As a side-note, when performing the matching process using a computer, storing the matches in a matrix is quite wasteful. The matrix is populated sparsely, and in diagonal lines. If the matrix is transformed by rotating it 45 degrees clockwise, the lines become horizontal, more obvious for detection, and easier to store as coordinates/vectors rather than a bulkier 2D array.
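One way to read this in code: cells that share the same value of (x - y) lie on the same diagonal, so that difference plays the part of the row after rotation. The sketch below is an assumption about how such storage could look, not a reference implementation; it never builds the 2D array and records each run as an (x_start, y_start, length) vector. Positions are counted from zero in reading order, and the function name is made up.

```python
from collections import defaultdict


def diagonal_runs(xs, ys, min_len=2):
    """Return runs of matches as (x_start, y_start, length) tuples."""
    # Index xs so each y symbol is only paired with its actual matches.
    positions = defaultdict(list)
    for j, x in enumerate(xs):
        positions[x].append(j)

    # Group the marks by diagonal number d = j - i.
    diagonals = defaultdict(list)
    for i, y in enumerate(ys):
        for j in positions.get(y, ()):
            diagonals[j - i].append(i)

    runs = []
    for d, rows in diagonals.items():
        start = prev = rows[0]
        # The trailing None is a sentinel that flushes the last run.
        for i in rows[1:] + [None]:
            if i is not None and i == prev + 1:
                prev = i
                continue
            length = prev - start + 1
            if length >= min_len:
                runs.append((start + d, start, length))
            if i is not None:
                start = prev = i
    return sorted(runs)


runs = diagonal_runs("GTAACA", "GATTACATA")
print(runs)  # prints [(1, 3, 2), (1, 7, 2), (3, 4, 3)]
```

For the example sequences, the three-long run starting at x = 3 is the ACA match, and the two shorter runs are both "TA".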


This can also be used to find commonly occurring patterns in a single document, by placing the same document on both the x and y axes and ignoring the main diagonal, (1,1), (2,2), (3,3) and so on, where every word simply matches itself.
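A sketch of the single-document case follows, with the same assumptions as before (whitespace tokens, case ignored, invented function name) and a made-up sample sentence. Because the matrix of a document against itself is symmetric, only the diagonals on one side of the main diagonal need to be walked.

```python
def repeated_phrases(text, min_words=2):
    """Return the set of phrases that occur more than once in text."""
    words = text.lower().split()
    n = len(words)
    found = set()
    # Offset 0 is the main diagonal, where every word matches itself,
    # so start at 1. Negative offsets mirror the positive ones.
    for offset in range(1, n):
        run = []
        for i in range(n - offset):
            if words[i] == words[i + offset]:
                run.append(words[i])
            else:
                if len(run) >= min_words:
                    found.add(" ".join(run))
                run = []
        if len(run) >= min_words:
            found.add(" ".join(run))
    return found


print(repeated_phrases("the cat sat on the mat and the cat ran"))
# prints {'the cat'}
```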

BladeCaster Project


I have recently been involved with the development of BladeCaster. BladeCaster is a web scraping tool which tidies content using Dave Raggett's HTML Tidy and then allows users to create custom XPath statements to extract the text from web pages and RSS feeds.
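To give a flavour of the tidy-then-XPath idea, here is a small Python sketch. It is not BladeCaster code: the page content and the XPath expression are made up, the page is assumed to have already been tidied into well-formed XHTML, and Python's standard library only supports a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical page, assumed to be already tidied into well-formed XHTML.
page = """<html><body>
<div class="story"><h2>First headline</h2><p>First summary.</p></div>
<div class="story"><h2>Second headline</h2><p>Second summary.</p></div>
<div class="advert"><p>Buy now</p></div>
</body></html>"""

root = ET.fromstring(page)

# A user-defined XPath statement: the heading of every "story" block.
headlines = [h.text for h in root.findall(".//div[@class='story']/h2")]
print(headlines)  # prints ['First headline', 'Second headline']
```

The tidying step matters because an XML parser like this one rejects the unclosed tags and stray markup found on most real web pages.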

BladeCaster was born from the need to capture web content and re-present it for offline reading on mobile devices such as mobile phones or the Sony Reader.

It's currently in the alpha stages. If you're interested, why not head over to the BladeCaster homepage, where you can download a free copy to play with.

Thoughtex - the WPF mind mapping tool

I came across what is possibly the most useful application of the Windows Presentation Foundation I have seen so far. Thoughtex is a fantastic new free mind mapping tool that uses WPF to draw its maps.


It feels very intuitive to use even though it's still in beta. The whole look-and-feel is very reminiscent of Windows Media Center.

Thoughtex supports images, links, and rich text inside the mind map boxes/nodes.

Yahoo search is integrated into the application, giving a slick selection of relevant information as you type.

To put the icing on the cake, a spellchecker is also included, which highlights your typos in Microsoft Office style.

It's a refreshing change to see such a beautiful, well thought out application. I for one will be replacing FreeMind now that I have a native alternative. Thoughtex is definitely one of those must-have tools. If the software is this good while still in beta, I can't wait to see what the finished product offers.

Installing Widcomm Bluetooth drivers under Vista x64


After upgrading my machine to Vista x64, I found my Broadcom Bluetooth 2.0+EDR driver had not been detected. This was no real surprise - I was actually expecting more problems than this. I visited the Broadcom site to download their latest Bluetooth drivers.

After the 33 MB download, the Broadcom Bluetooth installer tried to run, and fell over.

I scoured the Web looking for assistance, to no avail. Finally I found a Lenovo site offering x64 drivers for their Broadcom Bluetooth software. I installed the x64 Bluetooth driver part of the package and retried the Broadcom installer, and it installed seamlessly.

Here are the instructions in case anyone else is having the same problems:

1. Get the 68bu10ww.exe driver from the Lenovo support site.

2. Double click the package to extract the drivers.

3. Open Device Manager, select the Broadcom device, choose to update its driver, and point it at the x64 subfolder of the extracted Lenovo package (it defaults to C:\DRIVERS\Vista\BTOOTH\Win64).

4. Once the drivers are installed, go to the Broadcom Bluetooth update page, then download and install their driver. This is necessary if you want all of their great Bluetooth profiles, such as headset, remote, etc.

Vodafone UK withdraws its Pay-as-you-go email service


www.vodafone.net email services are no longer available to pay-as-you-go customers.

After many attempts to sign up to 'Vodafone Live' email services, Vodafone customer services informed me in this mail conversation that new PAYT customers are no longer allowed Vodafone mail services.

After my discussion with them a disclaimer appeared on the www.vodafone.net site indicating that this was the case. No public statement has been issued to Vodafone PAYT customers, nor is this stated on the vodafone.co.uk website.

However, in the last part of the mail conversation, the customer service representative adds that concerns from customers will be taken into account. So, if you are a PAYT Vodafone customer and want to gain or retain your mail facility, it's worth dropping them a line at customer.care@vodafone.co.uk.

Fixing the mess TweakVI left behind


I decided to install TweakVI today, as I had found TweakUI to be a very useful and stable product. Upon installing, it gave me the option to back up all of my system's settings. After doing this, I thought a reboot was in order.

I was annoyed to find out that TweakVI had made changes to my system configuration without me asking. It had mucked up the "My Computer" view. It was now showing "dummy" blank files. These dummy files were quite a frustration seeing as I couldn't view their properties, or in fact, delete them. In addition to this, my Windows Experience Index was no longer displaying, and I was receiving random Explorer crashes.

I immediately reloaded my backed up configuration files, uninstalled TweakVI, and rebooted.

I was horrified to find out that these changes were made to my system configuration upon install before I was prompted to create a backup, and that the backup copies (supposedly made before TweakVI could damage anything) contained modifications.

For the most part, the articles on the web that covered this problem recommended changing the values from within TweakVI, something that I really didn't want to do as it had caused this mess in the first place.

However, I found a couple of articles that helped me clean up the mess left by TweakVI. (Fixing the performance index, Fixing the "dummy files")

I can't believe that a commercial software manufacturer that is recommended in so many reviews can make changes to my Windows settings without my permission.

All I can hope is that their system test team has been sent to the wall. And of course that these are the only things that TweakVI has broken - only time will tell.

Software architecture

I’ve been thinking recently about which qualities are required of a software architect. I ran through the usual things that crop up in definitions of the profession, such as deep working knowledge of:


  • Patterns and practices.

  • Tools and technologies.

  • Design and development methodologies.

  • Software engineering.

Then I started thinking of the additional skills that usually get missed when recruiting, the things that give a successful architect the “X-Factor”.

After some thought, I came up with this list of skills which are often missed:

  • Charisma – Business users, technical team, analysts and product managers need to be able to buy in to the system. The key to this process is the ability to sell a system, and inspire confidence.

  • Vocabulary switching skills – Stakeholders and business people need different language from a technical team. For instance, when developers receive only a high-level overview from an architect, they can be left without confidence in the solution.

  • Drama and acting ability – Successful architects should go further than just taking a view of systems from the perspective of the users. They should actually step into each role, forgetting all of their systems knowledge, and take on the users’ expectations and needs.

  • Market insight – A well-architected solution should not contain redundant parts, but it should be easy to extend with minimal effort. An architect should foresee the extensions the market is likely to demand, and avoid designing in a way that could inhibit them.

  • Self-discipline – All too often, architects place art above practicality. A solution should contain exactly as much technology as is needed by the business requirement. A software architect must be able to appreciate the beauty and ingenuity of a simple solution to a complex problem.

  • Accepting feedback – An architect should encourage the business users and technical team to seek flaws in their design. This strengthening process is useful in building quality software. A strong character is required, but the architect should be ready to adapt the solution in the case of design issues.