ISO FileMaker Magazine: FileMaker Video Tutorials, Templates, Help & More

Automatic Text Extraction

Posted by: Editor / Friday, February 16, 2007 – 4:18pm

by Matt Petrowsky

29 minutes

One of the most powerful things about data is that it can be pulled, pushed, divided, combined and pretty much manipulated like Play-Doh. One of the keys to manipulating data is a technology called regex, or regular expressions.

In this video article I show you how to structure a solution which makes it very easy to quickly grab any number of elements from copied text. All a user has to do is copy the text and then click a button. The rest is automatically extracted using regex patterns and the SmartPill PHP plug-in. Don't delay in learning Regular Expressions. They're here to stay and they're now available within FileMaker. This video will give you insights into how you can use these two powerful technologies within your database.

P.S. Looking for the definitive book on Regular Expressions?

Details: Released - 2/16/2007 / Size - 29.55 MB / Length - 29 min

About author

Matt Petrowsky is the Senior Editor for ISO FileMaker Magazine. Matt has been involved with FileMaker Pro since the early '90s. Having authored many articles, a popular book, spoken at conferences and seminars, as well as provided private training, Matt is continuously updating his knowledge and skill about the powerful FileMaker platform. You can contact Matt by sending email to editor@filemakermagazine.com.


Can get other pattern

Hello everyone !

It seems a very powerful technique.

I tried it and got some nice results, but I hit a wall (my effort was stopped, without damage :-) ):

We have an online store and receive orders via email.

I was able to extract a lot of information from this email, but for the rest I have found no solution so far.

We have the following pattern in our PO:
...
Address
ThisCustomer_Address1
ThisCustomer_Address2
ThisCustomer_City, ThisCustomer_PostalCode
ThisCustomer_country
...
(Of course, every ThisCustomer value is real information about the customer.)

We would like to extract this:
address = ThisCustomer_Address1
ThisCustomer_Address2

City=ThisCustomer_City

Postal_code=ThisCustomer_PostalCode

country=ThisCustomer_country

In fact we have two problems:

--1--
I was not able to match multiple lines with regex. Maybe there is a flag that tells regex 'please let my pattern include carriage returns'. (We cannot accept every carriage return in the result, only one in this case.)

--2--
I don't see how to write a regex to get the City.
Maybe a regex that starts at the word ADDRESS, then skips two carriage returns and then takes characters until it reaches a ','?

If someone knows the solution, it would be very helpful and would help us validate this powerful technique.

Thanks in advance to everyone (and a special one for Matt :-)

Philippe

Advanced Regular Expressions - ADMIN POSTED THIS

Philippe, there is very little that regex can't extract, as long as you learn about all of the possibilities. When I was first learning to even read and understand regex, it took me about a week. It truly is like learning a foreign language. Eventually you can just read all the gobbledygook.

Some of the things you'll want to look into are the following:

Mode Modifiers: In the video, where I talk about the i meaning case insensitive, that is a mode modifier. You can use a modifier of m for multiline mode (^ and $ match at every line break) or s for single-line mode (the dot also matches line breaks). It would look like this: /somepattern/m or /somepattern/s. You'll need to research which one meets your needs.
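As a rough illustration of the two modifiers, here is a minimal sketch in Python, whose `re` module is used as a stand-in for the PHP regex engine the video relies on. The `re.MULTILINE` and `re.DOTALL` flags play the role of `/m` and `/s`. The sample text is made up for the example and is not taken from the article.

```python
import re

# Hypothetical sample text, not from the article.
text = "Address\n12 Main Street\nSuite 4"

# No modifier: ^ matches only at the very start of the text.
no_flag = re.findall(r"^\w+", text)

# m (re.MULTILINE): ^ also matches at the start of every line.
multiline = re.findall(r"^\w+", text, re.MULTILINE)

# No modifier: the dot stops at the first line break.
dot_default = re.search(r"Address.*", text).group(0)

# s (re.DOTALL): the dot also matches line breaks.
dot_all = re.search(r"Address.*", text, re.DOTALL).group(0)

print(no_flag)      # ['Address']
print(multiline)    # ['Address', '12', 'Suite']
print(dot_default)  # Address
print(dot_all == text)  # True
```

Roughly speaking, m changes what the anchors mean and s changes what the dot means, so the two can also be combined when a pattern needs both behaviors.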

Also keep in mind that you can do a search within a search to break out what you need. So searching for the Address and capturing that would allow you to run another search in multiline mode.
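One way the search-within-a-search idea could look for the order layout described in the question is sketched below in Python (a stand-in for the PHP engine). It assumes the block is always the `Address` label followed by exactly two address lines, a `City, PostalCode` line and a country line, and that lines end with `\n`. The sample values and the function name `extract_address` are hypothetical.

```python
import re

# Hypothetical order email; the values are invented for the example.
email_body = (
    "Order 1001\n"
    "Address\n"
    "12 Main Street\n"
    "Suite 4\n"
    "Springfield, 12345\n"
    "Freedonia\n"
    "Total 10.00\n"
)

def extract_address(body):
    # First search: capture the four lines that follow the "Address" label.
    block = re.search(r"^Address\n((?:.*\n){4})", body, re.MULTILINE)
    if block is None:
        return None
    # Second search: split the captured block into named parts.
    parts = re.match(
        r"(?P<address>.*\n.*)\n"            # two address lines, one line break kept
        r"(?P<city>[^,\n]+),\s*"            # everything up to the comma
        r"(?P<postal_code>.*)\n"            # rest of the city line
        r"(?P<country>.*)\n",               # final line of the block
        block.group(1),
    )
    return parts.groupdict() if parts else None

result = extract_address(email_body)
print(result["address"])
print(result["city"], result["postal_code"], result["country"])
```

Splitting the work in two keeps each pattern short: the first only has to find the block, and the second only has to understand its internal layout. If real orders sometimes have one address line or use `\r\n` line endings, both patterns would need adjusting.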

If you really want to get advanced, then you'll need to learn about things like "greedy" vs. "non-greedy" matching and lookarounds (lookahead and lookbehind).
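To make those terms a little more concrete, here is a small Python sketch (again a stand-in for the PHP engine; the sample line is invented). A greedy quantifier takes as much as it can, a non-greedy one as little as it can, and a lookaround checks the text next to a match without including it in the result.

```python
import re

# Hypothetical sample line, not from the article.
line = "Springfield, 12345, Freedonia"

# Greedy: .* runs as far as possible, so the match ends at the LAST comma.
greedy = re.match(r"(.*),", line).group(1)

# Non-greedy: .*? stops as early as possible, at the FIRST comma.
lazy = re.match(r"(.*?),", line).group(1)

# Lookahead: digits that are followed by a comma; the comma is not captured.
digits = re.search(r"\d+(?=,)", line).group(0)

# Lookbehind: letters that are preceded by ", "; the prefix is not captured.
after_comma = re.search(r"(?<=, )[A-Za-z]+", line).group(0)

print(greedy)       # Springfield, 12345
print(lazy)         # Springfield
print(digits)       # 12345
print(after_comma)  # Freedonia
```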

Learning all of the nuances is what will make regex the most powerful - and essentially, you'll need to modify/enhance the technique to fit your needs. But don't get down, it's totally possible to do!

Sincerely,

Matt Petrowsky

Another great regex reference - ADMIN POSTED THIS

I left a note in the article synopsis about the full Mastering Regular Expressions book.

What I didn't mention is that I use the Regular Expressions Pocket Reference any time I need a quick place to check the syntax of regex. It's been a great help over the years!
