Tag: Erlang

Download Youtube video

January 11, 2009

How to download Youtube videos has been done by a number of bloggers in a number of languages. I just wanted to try it out myself in Erlang, but I had no idea where to start.

Now that I’ve decided to start, let me grab an Eddie Vedder song from Youtube:

http://www.youtube.com/watch?v=gct6BB6ijcw

And lets sniff the HTTP traffic using Wireshark to see what actually happens when we watch a Youtube video. The browser made a conection to Youtube and before it started streaming the video, the input URL got redirected to a different Youtube URL (Depending on whether Youtube is caching the video or not, it could in turn probably be redirected to a different Cache server or a different IP):

No.     Time        Source                Destination           Protocol Info
     96 3.295105    xx.xx.xx.xx        208.65.153.253        HTTP     GET /get_video?video_id=gct6BB6ijcw&t=OEgsToPDskJ6n06uQXzbbyp7xAnxK6pN&el=detailpage&ps= HTTP/1.1

No.     Time        Source                Destination           Protocol Info
    115 3.879454    xx.xx.xx.xx        209.85.239.30         HTTP     GET /get_video?origin=lax-v113.lax.youtube.com&video_id=gct6BB6ijcw&ip=xx.xx.xx.xx&region=0&signature=587F68CED7B14F380192AAB1D58942F0EAB9AE7B.6379C474E29D2B2348E1A69954D7FACFC461F964&sver=2&expire=1231731436&key=yt4&ipbits=0 HTTP/1.1

Upon playing with the new URL in the browser, I realized that the URL that Youtube gets the video from is

http://youtube.com/get_video?video_id=gct6BB6ijcw&t=OEgsToPDskJ-EKxTpxj79WK0fWWs_YjO

The param “video_id” is the same as the param “v” in the original URL. So we only need to find the value for the param “t”.
Lets look at the HTML source for

http://www.youtube.com/watch?v=gct6BB6ijcw

and see if it contains the value for the parameter “t”. Luckily, grepping for “&t=”, I found this in the source:

&t=OEgsToPDskKrsk1Xwku653CJbAXXrdJb

The value of “t” in the HTML source and the value of “t” in the redirect URL is different, but I found that both values of “t” seemed to work when appended to the URL. Oh well, I tried it again just to make sure and realized that the value of the parameter “t” changes for every request, but all values seem to work(Probably it is timestamp dependent!).

So we have a plan now:
1. Get a Youtube video URL and make a HTTP request to it.
2. Get the body of the reponse and find the value of “t” using regex pattern matching.
3. Generate the proper redirect URL using the two parameters “video_id” and “t”.
4. Make a http request to the new URL and stream the bytes and write to a file with “.flv” extension.

Here is the full source code in Erlang: 

-module(video_downloader).
-export([download/1]).

download(URL) ->
    {ok, {_Status, _Header, Body}} = http:request(URL),
    Video_URL = get_video_download_url(URL, Body),
    stream_video(Video_URL).

stream_video(Video_URL) ->
    io:format("Downloading video from ~p~n", [Video_URL]),
    {ok, {_Status, _Header, Body}} = http:request(Video_URL),
    file:write_file("myvideo.flv", Body),
    io:format("Download complete!").

get_video_download_url(URL, Body) ->
    Matcher = "&t=[A-Za-z0-9-_]*",
    {match, Start, Length} = regexp:first_match(Body, Matcher),
    T = string:substr(Body, Start, Length),
    {ok, New, _No} = regexp:sub(URL, "watch\?v=", "get_video?video_id="),
    New ++ T.

Lets run it from the Erlang shell:

1> c(video_downloader.erl).
{ok,video_downloader}
2> inets:start().
ok
3> video_downloader:download("http://www.youtube.com/watch?v=gct6BB6ijcw").
Downloading video from "http://www.youtube.com/get_video?video_id=gct6BB6ijcw&t
=OEgsToPDskLUZgy2pfyoRf-AtXCdHhYG"
Download complete!ok

Go to your current directory and use any Flv Viewer to see if the downloaded file is a working video or convert it to format of your choice and watch it offline.

I would love to give an Erlang twist to it by spawning a few concurrent processes to download videos, but this is not quite a good example to do it from by localbox – too much of Disk IO, Network IO and slow Internet connection.

Erlang – overhyped or underestimated?

November 14, 2008

Erlang is like an exotic beautiful woman with no dressing sense.

I came across Erlang about a year ago. It is a language of a kind – precise, crafted and powerful. Its been there for about two decades now, and people have been using it for serious real-time applications. But the hype Erlang is getting in the last couple of years is mostly for wrong reasons.

Having said that, its a language every programmer should look into at least once – its refreshing. And it might just be the perfect language that fits your needs.

What doesn’t Erlang have? It has its own Virtual Machine that it runs on. It is ridiculously simple to write distributed applications, with its message-passing style concurrency. It has its own database. It supports hot code swapping without even restarting the servers, and its already been proven in some telecom applications that need very high uptime. Hot code swapping is still a nightmare even in a supposedly mature language like Java which is targeted at building web applications – that have high availability and scalability.

However there are certain things that either suck or are not as convincing about Erlang.

1. Today’s mainstream developers who are used to C or Java like syntax wont find its Prolog-like syntax too friendly. Unless you started programming since the 80′s & 90′s and are used to languages of similar syntax, it will take quite some time before you get comfortable with Erlang’s weird syntax. I never felt too comfortable with the syntax.

2. While the core language itself is small and easy to learn, the libraries within the language are inconsistent, incomplete and poorly documented. I posted a couple of questions in the forums regarding how to use a library, and usually the answers would be “Don’t use that library, use the other one”. (Oh yeah, kind of like the Java JDK Logging. Please use Commons Logging.)

3. Only a few people have written production level codes and you rarely get to hear from them. All you hear from is Erlang enthusiasts, who are hyping it as the next big thing, but haven’t done more than a few labs from Armstrong book.

4. I can’t imagine how you can organize large code-bases in Erlang or even work as team, and this doesn’t feel right to any OO programmer.

5. Most of the performance matrices are one-sided, and are performed by people who have an interest in Erlang. I would love to see some independent analysis.

6. Its support for web-development is very primitive. With web frameworks like rails and grails, there is a lot of serious work for Erlang if it ever intends to go to that market.

7. Did I talk about Strings in Erlang? IO speed?

I know weaknesses aren’t as important as the strengths of a language. Erlang has it own expertise, its syntax structure, and its own audience. But the flaws of Erlang might just turn away a new programmer, even before he gets to its beauty.

If you are writing a web crawler, Erlang may very well be your choice. If you want to write a client-server, where the client makes a large no of requests, and you want to spawn concurrent processes to process the requests, Erlang could be your choice. If you want to write a Distributed Hash Table, Erlang could be your choice. Or if you are writing a video streaming server or doing system integration or writing any system utility. But a regular developer (building a CRUD application on top of a database, right? ) doesn’t have much to do with Erlang as yet. Secondly, even if you are working on those highly scalable, reliable and concrurrent systems, people have a hard time accepting Erlang along with its flaws.

The industry has a definite space for Erlang, currently and more so in future as we deal with more and more users, more data, and more forms of distribution. If not for Erlang exactly, then for an improved version of Erlang. It isn’t here to be the next Java, but to solve out the problems that Java couldn’t do smoothly in over a decade (despite having such a great community).

Erlang is going from an underestimated to an overhyped language. I wish it can convert the hype and raw interest in Erlang into something meaningful. How about a modern variation of Erlang on the Erlang’s virtual machine. Is it too late?

MD5 in Erlang

September 10, 2008

An MD5 hash for any given message is 16 bytes (128 bits) in size and is represented by a unique 32 digit Hexadecimal number.

Erlang has a built-in-function to calculate the MD5, which returns the hash in the form of a binary data-structure.

$ erlang:md5("hello").
<93,65,64,42,188,75,42,118,185,113,157,145,16,23,197,146>

But what we usually need is a string representation of the Hexadecimal value. Hence, we need to convert the Erlang Binary to a Hex-string, for which I didn’t find any BIF.

So first of all, lets convert the Binary to a list (of integers)- so that we can process it easily. Erlang has a function for converting binary to list:

$ binary_to_list(<93,65,64,42,188,75,42,118,185,113,157,145,16,23,197,146>).
[93,65,64,42,188,75,42,118,185,113,157,145,16,23,197,146]

Now, we have to convert each of the integers in the list into its hex equivalent. How do you convert an integer in Decimal system to a Hexadecimal system?

Eg. Take an integer 230. Divide it by 16.
230 div 16 = 14 In Hex, 14 is E
230 rem 16 = 6 In Hex, 6 is 6
So 230 in Hex is E6.

Now we will have to do the same for every integer in the list. I did it using the lists:map function and the applying the int_to_hex conversion to every integer:

$ lists:map(fun(X) ->
int_to_hex(X) end, L).

This will actually return a list of integers representing the (hex) string, which is how a string is represented in Erlang – a list of Integers.

The complete code is below:

-module(md5).
-export([md5_hex/1]).

md5_hex(S) ->
       Md5_bin =  erlang:md5(S),
       Md5_list = binary_to_list(Md5_bin),
       lists:flatten(list_to_hex(Md5_list)).

list_to_hex(L) ->
       lists:map(fun(X) -> int_to_hex(X) end, L).

int_to_hex(N) when N < 256 ->
       [hex(N div 16), hex(N rem 16)].

hex(N) when N < 10 ->
       $0+N;
hex(N) when N >= 10, N < 16 ->
       $a + (N-10).

Output:

$ md5:md5_hex("hello").
"5d41402abc4b2a76b9719d911017c592"

Erlang mode on Emacs

August 12, 2008

Although there are a few other editors for Erlang, I prefer to use Emacs for Erlang and its the only major reason I use Emacs for. Erlang now has an eclipse plugin too, called Erlide.

Erlang comes with the emacs mode as part of its standard distribution, so you only need to customize your emacs settings to use the erlang mode. Once you have Emacs installed (I have Carbon Emacs on my Mac OSX), create a .emacs file in your home directory (or use the one that you already have).

$ vi ~/.emacs

Then insert the following lines of Lisp code into your .emacs file.

;Erlang Mode
(setq load-path (cons  "/usr/local/lib/erlang/lib/tools-2.6.1/emacs" load-path))
(setq erlang-root-dir "/usr/local/lib/erlang")
(setq exec-path (cons "/usr/local/lib/erlang/bin" exec-path))
(require 'erlang-start)

/usr/local/lib is where my Erlang is installed, and /usr/local/lib/erlang/lib/tools-2.6.1/emacs is the location where erlang.el and erlang-start.el files are, which actually define and initialize the Emacs Erlang-mode.

Update the path and version of the tools in the above code as per your installation, and enjoy the amazing features – including the Erlang shell right from Emacs.