Posts Tagged ‘tutorial’

5 Python Vids

March 30th, 2008

If you’re starting out with Python, you may find the following videos useful:

Basic Python

Advanced Python

Python Design Patterns

Developing a Product in Python

Django Introduction

scRUBYt! Tutorial: Dogs of the FTSE

September 5th, 2007

I’ve been getting into investing a lot recently, so I bought a book by James O’Shaughnessy titled “What Works on Wall Street”. After a couple of chapters I couldn’t hold off any longer on trying something I’d read about at the end of chapter 1: the Dogs of the Dow. You can visit the link for more info, but in summary it’s an investment method where you buy the highest-yielding stocks in the Dow Jones Industrial Average and then refresh the selection every year.

I’m a Brit, so rather than try this with the Dow I went for the FTSE 100, which is a similar index. First of all I looked to see whether Google, Yahoo or any other finance site would give me the dividend yields all in one place, but alas, no. So I decided to give scRUBYt! a go.

So here goes: scRUBYt! being used to Let the Dogs Out!

In this example I use:

* 2 Extractors – One feeding the other.
* Form filling and submitting.
* Page navigation.
* Constraints.

The full code is given below, and you can also download it here: Dogs of the FTSE

First, in plain English, here’s what I did:

1. First Extractor – This gets the most up-to-date list of companies in the FTSE 100 index. It navigates to the Yahoo Finance page containing the list and extracts the stock code for each one. You’ll notice that the list is split across two pages, but scRUBYt! handles this fantastically: all it needs is a “next_page” command. You can’t get much more elegant. ;)
2. We need the above stock codes because the dividend yield that I need is on each company’s profile page. So each page needs to be retrieved individually after we place the codes into a hash.
3. Second Extractor – Looping through each code, we create an extractor that enters the code into the search box of the Yahoo Finance page, submits the form, then retrieves the data.
4. Enter each stock’s data into a hash, with the stock code acting as the key.
5. The final block at the end just loops through the final dataset and outputs it to the screen. You can do anything you like with it, though, such as write it to a file, tabulate it, etc.
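
The five steps above boil down to a two-stage pipeline: one lookup produces the list of stock codes, and a second lookup runs once per code, with the results collected in a hash keyed by stock code. Here is a minimal plain-Ruby sketch of that shape only. It makes no scRUBYt! calls and no network requests: `fetch_codes` and `fetch_fields` are hypothetical stand-ins for the two extractors, and the stock codes and figures are made up.

```ruby
# Hypothetical stand-in for the first extractor: returns one hash per
# listing, which is the shape the script iterates over.
def fetch_codes
  [{ :stock_code => "AAA.L" }, { :stock_code => "BBB.L" }]
end

# Hypothetical stand-in for the second extractor: returns the
# header/value pairs for one company (made-up figures).
def fetch_fields(code)
  sample = {
    "AAA.L" => [["Div Yield", "4.1%"], ["P/E", "12.0"]],
    "BBB.L" => [["Div Yield", "2.3%"], ["P/E", "18.5"]]
  }
  sample.fetch(code).map do |header, data|
    { :fieldheader => header, :fielddata => data }
  end
end

# Stage one feeds stage two; results are keyed by stock code.
def collect_data
  final_data = {}
  fetch_codes.each do |listing|
    code = listing[:stock_code]
    final_data[code] = fetch_fields(code)
  end
  final_data
end

collect_data.each do |code, fields|
  puts code
  fields.each { |f| puts "  #{f[:fieldheader]}: #{f[:fielddata]}" }
end
```

Keeping the per-stock results under the stock code means any later step (printing, filtering, ranking) can look a company up directly without re-running either lookup.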

Now onto the example – I’ve also added comments to this to help with anything that may be unclear. I’m always willing to answer questions if you post them below.

require 'rubygems'
require 'scrubyt'
final_data = {}
# We create our first extractor to get the FTSE 100 list from Yahoo
ftse_list = Scrubyt::Extractor.define do
  fetch 'http://uk.finance.yahoo.com/q/cp?s=%5EFTSE'
  ftse_listing("/html/body/div/div/table/tr/td/table/tr/td/table/tr/td/b", { :generalize => true }) do
    stock_code("/a[1]")
  end.ensure_presence_of_pattern("stock_code")
  # The listing is split across two pages so we go to the next page and repeat
  next_page("Next", { :limit => 2 })
end

# All of my scraped data is being put into "ftse_100", one hash per listing
ftse_100 = ftse_list.to_hash

# Now for each FTSE listing...
ftse_100.each do |ftse_1|
  # ... get the stock code ...
  sc = ftse_1[:stock_code]

  # Our final extractor for searching a stock code and retrieving all relevant data (div yield, etc)
  # I get the fieldheader such as "Div Yield" as well as the field data such as "2.3%"
  co_data = Scrubyt::Extractor.define do
    fetch 'http://uk.finance.yahoo.com/'
    fill_textfield 's', sc
    submit

    co_fields("/html/body/div/div/table/tr/td/table/tr/td/table/tr", { :generalize => true }) do
      fieldheader("/td[1]")
      fielddata("/td[2]")
    end.ensure_presence_of_ancestor_node(:td, { "class" => "yfnc_tabledata1" })
  end

  # Adding all the stock data to a hash with the key being the stock code
  final_data[sc] = co_data.to_hash
end

# Here you can do what you want with the final set of data you got above.
# I'm just outputting the "Dividend Yield" for each stock.
# Note my check for the string "Yield"
final_data.each do |key, entry|
  puts "\n\n#{key}"
  entry.each do |dataset|
    if dataset[:fieldheader].include?("Yield")
      puts "#{dataset[:fieldheader]} #{dataset[:fielddata]}"
    end
  end
end
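
To get from this raw output to an actual list of “dogs”, the yields need to be turned into numbers and ranked. Here is a rough sketch of that step, assuming `final_data` has the shape used above (a stock code mapped to a list of `:fieldheader`/`:fielddata` hashes) and that yields arrive as strings like "4.1%". The helper names `parse_yield` and `top_dogs`, the sample data, and the default of keeping ten stocks are illustrative assumptions; the real field text on the page may differ.

```ruby
# Turn a string such as "4.1%" into 4.1.
# Returns nil when no number is found (for example "N/A").
def parse_yield(text)
  match = text.to_s.match(/-?\d+(?:\.\d+)?/)
  match ? match[0].to_f : nil
end

# Return [code, yield] pairs for the n highest-yielding stocks.
def top_dogs(final_data, n = 10)
  yields = final_data.map do |code, fields|
    row = fields.find { |f| f[:fieldheader].include?("Yield") }
    value = row && parse_yield(row[:fielddata])
    [code, value]
  end
  # Drop stocks with no usable yield, then sort highest first.
  ranked = yields.reject { |_, value| value.nil? }
  ranked.sort_by { |_, value| -value }.first(n)
end

# Made-up sample data in the same shape as final_data.
sample = {
  "AAA.L" => [{ :fieldheader => "Div Yield", :fielddata => "4.1%" }],
  "BBB.L" => [{ :fieldheader => "Div Yield", :fielddata => "2.3%" }],
  "CCC.L" => [{ :fieldheader => "Div Yield", :fielddata => "N/A" }],
  "DDD.L" => [{ :fieldheader => "Div Yield", :fielddata => "5.6%" }]
}

top_dogs(sample, 2).each do |code, value|
  puts format("%-8s %5.2f%%", code, value)
end
```

With the real data you would call `top_dogs(final_data)` after the scraping loop has finished. Stocks whose yield can’t be parsed are skipped rather than treated as zero, so a missing figure never sneaks into the ranking.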

Source available here.