
Rules Based Torrent Downloader

9 Jun 2016 | GPL3 | 11 min read
Fetches torrent files from online sources based on rules


Background

LEGAL DISCLAIMER
Intellectual property theft (piracy) is an issue that affects all of us. When individuals obtain copies of data files containing copyrighted material, the content rights owner is deprived of the revenue necessary to continue the creation of their content. Intellectual property is protected by international law and penalties for infringement can be substantial. What one jurisdiction labels 'fair use', another might label as piracy, so abundant caution should be exercised prior to downloading files on the Internet since doing so might expose you to personal liability for damages.

The year was 2004. Apple's biggest product, the iPod, was under attack by Dell and other competitors trying to edge into the portable MP3 player retail space. Nokia executives confidently bragged that two-thirds of their consumer mobile phones would contain camera-like functionality "really soon." Some enterprising students at MIT were experimenting with a truly "crazy" idea of adding short video clips to their web-logs (blogs). They even started referring to these diary entries as "video blogs," or VLogs for short. Laptop computers had dropped below the $800 price point the previous Christmas, and market experts anointed the personal desktop computer an antiquated dinosaur which could never survive the onslaught of sub-$1000 laptops.

It was in this climate of technological upheaval that I happened to stumble upon a truly revolutionary piece of technology: the TiVo. The TiVo itself wasn't new in 2004; it had been available for a few years at that point. What was new was the ability to modify (i.e. hack) it. Some devoted enthusiasts over at the DealDatabase forums had managed to modify the TiVo operating system to do some fairly remarkable things, such as stream recorded shows over a LAN/WAN, display caller ID information on your TV, or even replace station iconography, images, and logos with custom graphics. The critical discovery that led to all the amazing extensions for the TiVo system came when enthusiasts identified the specific daemon responsible for initializing the encrypted file system. By configuring the TivoLinux rc init scripts to bypass that service initialization logic, modders were able to get the TiVo operating system to boot up an unencrypted, readable file system.

 

Image 3
Fig. 1: Adding a larger hard drive into a Tivo.

 

It's important to understand that this hack, and the homebrew that came along over the glory years of the TiVo, did not grant illegal access to the DirecTV or TiVo services. These modifications dramatically altered the capabilities of the TiVo, but they were solely dependent on the end user owning the legal base services.

 

Image 4
Fig. 2: Transferring TV show recordings from Tivo to PC.

 

The utilities and features that came out of the TiVo homebrew scene completely transformed how many of us consumed television programming. Around the same time the InstantCake team started releasing simple, user-friendly installation Zipper image files for TiVo, I was a traveling IT consultant. Being on the road 6 days a week, I had few opportunities to enjoy television programming. For me personally, these exciting homebrew features had an immediate impact. On the one day each week I was at home (Sunday), I would spend most of that day downloading my TiVo recordings onto my laptop in preparation for the week ahead. Rather than reading stacks of mindless Newsweek or Maxim magazines during long commutes across the country, I now had the ability to watch all of my TiVo recordings on the plane, or while trapped in some airport terminal.

The programming package I purchased from my television service provider no longer held any intrinsic dependency on my physical television set. I was free to watch my recorded programs anywhere in the world, at any time, without any limits, restrictions, or constraints. It's difficult to communicate just how profound a concept this was in 2004. For my entire life, up to that point, TV programming was static, it was singular, it was as much a part of the physical structure of my home as the drywall or furniture. Suddenly, seemingly overnight, it had been liberated.

What's the point of the history lesson, gramps?

The world continues to spin, innovations abound, and life moves on. It's been twelve years since I first witnessed the amazing breakthroughs being released by the Tivo homebrew community. I haven't seen a Tivo in ages and have no idea what the state of its scene is today (if it exists at all anymore). In all the time that has passed since then, I never forgot those few years I had with that TiVo unit or how remarkable the freedom it provided was.

This project aims to restore some of that freedom. Rather than opening a consumer electronics device and modifying it to do what we want, we will instead leverage the public Internet and the BitTorrent protocol to provide functionality similar to a modified DVR. This application isn't going to recommend shows to you or learn your specific interests based on thumbs up or thumbs down ratings, but it should provide a means to automatically acquire all your shows for you.

FAIR USE DOCTRINE
In its most general sense, 'fair use' is any copying of copyrighted material done for a limited and "transformative" purpose, such as to comment upon, criticize, or parody a copyrighted work. It is debatable whether or not obtaining a television program from the Internet that you have legal access to via a cable/satellite subscription is "transformative". Even if this is considered fair use, not everyone is covered by the fair use doctrine, and in some regions fair use isn't even clearly defined in law.

Custom PVR Downloader

This solution is a simple rules-based download engine. It requires a rule-set that defines what actions to take when executed. The rule-set specifies where to fetch the programming indices, which expressions to apply to which indices, and what actions to take when the downloader identifies matches.

 

Image 6
Fig. 3: Basic PVR downloader design (rss-mode).

 

The Rule-Set (Rules.xml)

The heart of the solution is the rule-set. This is an XML file containing directives that control how the program functions. Below is an example of a basic rule-set, which consumes two program index sources.

XML
<?xml version="1.0" encoding="utf-8"?>
<ruleset revision="1.2" mode="rss">
  <!-- IGN Podcasts -->
  <rules downloadPath="\\baileyfs01\Transmission\Podcasts\_converted" extensions="*.mp3;*.mp4" feedId="177" rssUrl="http://feeds.ign.com/ignfeeds/podcasts/games?format=xml">
    <expression limit="1" enabled="true"><![CDATA[Game.Scoop]]></expression>
  </rules>
  <!-- Kickass Torrents: Television -->
  <rules downloadPath="\\baileyfs01\Transmission\_torrent_watch" extensions="*.torrent" feedId="175" rssUrl="https://kat.cr/tv/?rss=1">
    <expression limit="1" enabled="true"><![CDATA[^BBC.Horizon.]]></expression>
    <expression limit="1" enabled="true"><![CDATA[^ABC.World.News.Tonight.]]></expression>
    <expression limit="1" enabled="true"><![CDATA[^CBS.Evening.News.]]></expression>
    <expression limit="1" enabled="true"><![CDATA[^NBC.Nightly.News.]]></expression>
  </rules>
  <history/>
</ruleset>

ruleset Element attributes:

  • revision: The rules engine version in use.
  • mode: The index source (either rss or database).

rules Element attributes:

  • downloadPath: The path to download files.
  • extensions: Semicolon-separated list of file extension constraints.
  • feedId: Only applicable in database mode. The FeedID in the Feed database table.
  • rssUrl: Only applicable in rss mode. The URL of the RSS feed.
  • enabled: (Optional) Enables or disables all expressions in the rule-set (true if not specified).

rules.expression Element attributes:

  • limit: The maximum number of matches to download (0 is no limit).
  • enabled: Whether the expression is applied (true) or ignored (false).

rules.expression Element Value:
The value of each expression contains a regular expression pattern used to identify matches in the defined program index.
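
To make the attribute descriptions above concrete, here is a minimal sketch of how a rule-set of this shape could be loaded. It is written in Python for brevity and is not the project's C# implementation. The `load_rules` function, the dictionary keys, and the sample path and URL are assumptions made purely for illustration, as is the choice to treat a missing `enabled` attribute on an expression as true.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule-set for illustration; the path and URL are placeholders.
RULES_XML = r"""<ruleset revision="1.2" mode="rss">
  <rules downloadPath="C:\Downloads" extensions="*.mp3;*.mp4" feedId="177" rssUrl="http://example.com/feed">
    <expression limit="1" enabled="true"><![CDATA[Game.Scoop]]></expression>
    <expression limit="0" enabled="false"><![CDATA[^BBC.Horizon.]]></expression>
  </rules>
  <history/>
</ruleset>"""


def is_enabled(element):
    """Treat a missing 'enabled' attribute as true."""
    return element.get("enabled", "true").lower() == "true"


def load_rules(xml_text):
    """Return one dict per enabled <rules> element, with compiled expressions."""
    root = ET.fromstring(xml_text)
    feeds = []
    for rules in root.findall("rules"):
        if not is_enabled(rules):
            continue
        # Keep only enabled expressions, paired with their download limit.
        expressions = [
            (re.compile(expr.text.strip()), int(expr.get("limit", "0")))
            for expr in rules.findall("expression")
            if is_enabled(expr) and expr.text
        ]
        feeds.append({
            "mode": root.get("mode"),
            "url": rules.get("rssUrl"),
            "download_path": rules.get("downloadPath"),
            "extensions": rules.get("extensions", "").split(";"),
            "expressions": expressions,
        })
    return feeds


for feed in load_rules(RULES_XML):
    print(feed["url"], [rx.pattern for rx, _ in feed["expressions"]])
```

Compiling each pattern once at load time means a typo in an expression fails early, before any feed is fetched.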

Running the Downloader

Executing the assembly without any switches displays the help screen. This solution is designed to execute as a scheduled task at pre-determined intervals (once an hour, for example).

Below are the switches available:

Syntax:
Baileysoft.Rss.Downloader.exe [/t] | [/e] | [/?] | [?]
    /t= Test run only. Apply the ruleset but don't download any files
    /e= Execute. Execute the application with full functionality

Examples:
Baileysoft.Rss.Downloader.exe /e

Before deploying the downloader to a server, though, you will want to build and test your regular expressions. To do this I have provided the batch file I use to test my own expressions. The script is called Baileysoft.Rss.Downloader.Options.cmd. Executing the script presents an interactive menu designed to simplify working with the application.

Running the Expression tester

Baileysoft.Rss.Downloader.Options.cmd

Baileysoft.Rss.Downloader.exe - nealbailey@hotmail.com
Copyright Baileysoft Solutions 2002-2016

 1: Execute a test run
 2: Execute a full run
 3: Open the log file
 4: Delete all logs

Make a selection [1/2/3/4]>

Selecting option 1 performs a test run and then opens the log file for analysis:

2016-04-05 11:18:49,147 [1] | INFO  | Baileysoft.Rss.Downloader.exe started
2016-04-05 11:18:49,164 [1] | INFO  | This is a test-run. No files will be downloaded.
2016-04-05 11:18:49,165 [1] | DEBUG | Began Parsing ruleset file into memory.
2016-04-05 11:18:49,184 [1] | DEBUG | Successfully parsed the ruleset file from disk.
2016-04-11 11:18:49,257 [1] | INFO  | Fetching nodes for Feed 'http://feeds.ign.com/ignfeeds/podcasts/games?format=xml'
2016-04-11 11:18:49,982 [1] | DEBUG | Retrieved '200' nodes from Feed 'http://feeds.ign.com/ignfeeds/podcasts/games?format=xml'
2016-04-11 11:18:49,984 [1] | DEBUG | Applying rules to the identified nodes.
2016-04-11 11:18:49,985 [1] | INFO  | Found '1' nodes that match the ruleset file restrictions.
2016-04-11 11:18:49,987 [1] | DEBUG | Match: Game Scoop! : Game Scoop! Episode 385
2016-04-11 11:18:49,988 [1] | INFO  | Downloading file: 'Game Scoop! : Game Scoop! Episode 385' | url 'http://feeds.ign.com/~r/ignfeeds/podcasts/games/~5/dRcIzJLE-8o/Game_Scoop_Episode_385.mp3'
2016-04-11 11:18:49,991 [1] | INFO  | Fetching nodes for Feed 'https://kat.cr/tv/?rss=1'
2016-04-05 11:18:50,431 [1] | DEBUG | Retrieved '50' nodes from Feed 'https://kat.cr/tv/?rss=1'
2016-04-05 11:18:50,433 [1] | DEBUG | Applying rules to the identified nodes.
2016-04-05 11:18:50,436 [1] | INFO  | Found '2' nodes that match the ruleset file restrictions.
2016-04-05 11:18:50,438 [1] | DEBUG | Skipping 'CBS.Evening.News.2016.04.04.(Eng.Subs).SDTV.x264-[2Maverick]' : it has been previously downloaded.
2016-04-05 11:18:50,439 [1] | DEBUG | Skipping 'NBC.Nightly.News.2016.04.04.WEB-DL.x264-[2Maverick]' : it has been previously downloaded.
2016-04-05 11:18:50,926 [1] | INFO  | Baileysoft.Rss.Downloader.exe completed execution

It's quite common for multiple versions of a program to be released, so it makes sense to perform test runs frequently and refine your expressions over time to ensure you download the highest quality (or lowest quality) releases.
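
As a rough illustration of refining an expression, the sketch below tries a broad and a narrow pattern against a few release titles. It is a Python approximation, not the downloader's own code, and it assumes the expressions are applied as an unanchored search (which is consistent with `Game.Scoop` matching "Game Scoop!" in the log above, since `.` matches any character). The `matches` helper is hypothetical, and the third title is invented for the example.

```python
import re

titles = [
    "CBS.Evening.News.2016.04.04.(Eng.Subs).SDTV.x264-[2Maverick]",
    "NBC.Nightly.News.2016.04.04.WEB-DL.x264-[2Maverick]",
    "NBC.Nightly.News.2016.04.04.SDTV.x264-[2Maverick]",  # hypothetical second release
]


def matches(pattern, candidates):
    """Return the candidates in which the pattern is found anywhere."""
    rx = re.compile(pattern)
    return [title for title in candidates if rx.search(title)]


# Broad: '.' matches any character, so both NBC releases qualify.
broad = matches(r"^NBC.Nightly.News.", titles)

# Narrow: escape the dots and require the WEB-DL tag to prefer that release.
narrow = matches(r"^NBC\.Nightly\.News\..*WEB-DL", titles)

print(len(broad), "broad matches;", len(narrow), "narrow match")
```

With a limit of 1, the broad pattern would take whichever NBC release appears first in the index, while the narrow pattern only ever accepts the WEB-DL one.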

Downloading Podcasts

Because this is a batch download application, it can work with any type of file you wish to download on a schedule, assuming (of course) there is an index (either RSS or SQL database) of available files.

Imagine you follow content produced by IGN.com and you would like to automatically download each new episode of the Game Scoop show whenever it becomes available. The IGN podcast index includes all of IGN's shows, but we are only interested in the Game Scoop show.

To accomplish this we need to add some rules to our rule-set. Below are rules that enable this functionality.

<!-- IGN Podcasts -->
<rules downloadPath="C:\Users\Shadow\Downloads" extensions="*.mp3;*.mp4" feedId="177" rssUrl="http://feeds.ign.com/ignfeeds/podcasts/games?format=xml">
  <expression limit="1" enabled="true"><![CDATA[Game.Scoop]]></expression>
</rules>

In a nutshell, the above rules mean the following:

  1. Get the RSS release index from http://feeds.ign.com/ignfeeds/podcasts/games?format=xml
  2. Find the first release that matches the regular expression 'Game.Scoop'
  3. Download the first release match (if it's an MP3 or MP4 file) to the folder: C:\Users\Shadow\Downloads

The feedId value can be any integer (RSS mode doesn't use this value).
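
The three steps above can be sketched roughly as follows. This is an illustrative Python approximation rather than the downloader's actual C# logic. It works on an inline sample feed instead of the live IGN URL, assumes the file link lives in each item's `enclosure` element, and only selects items without downloading them. `select_downloads` and the example.com URLs are hypothetical.

```python
import fnmatch
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Tiny stand-in for a podcast index; titles mimic the log output shown earlier.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <item><title>Some Other Show Episode 1</title>
    <enclosure url="http://example.com/other_show_1.mp3" type="audio/mpeg"/></item>
  <item><title>Game Scoop! : Game Scoop! Episode 385</title>
    <enclosure url="http://example.com/Game_Scoop_Episode_385.mp3" type="audio/mpeg"/></item>
  <item><title>Game Scoop! : Game Scoop! Episode 384</title>
    <enclosure url="http://example.com/Game_Scoop_Episode_384.mp3" type="audio/mpeg"/></item>
</channel></rss>"""


def select_downloads(rss_text, pattern, limit, extensions):
    """Return (title, url) pairs that match the pattern and extension filters."""
    rx = re.compile(pattern)
    selected = []
    for item in ET.fromstring(rss_text).iter("item"):
        title = item.findtext("title", default="")
        enclosure = item.find("enclosure")
        if enclosure is None or not rx.search(title):
            continue
        url = enclosure.get("url", "")
        filename = urlparse(url).path.rsplit("/", 1)[-1]
        # Extension constraints are glob-style, e.g. "*.mp3".
        if not any(fnmatch.fnmatch(filename.lower(), ext.lower()) for ext in extensions):
            continue
        selected.append((title, url))
        if limit and len(selected) >= limit:  # a limit of 0 means no limit
            break
    return selected


picks = select_downloads(SAMPLE_RSS, "Game.Scoop", 1, ["*.mp3", "*.mp4"])
print(picks)
```

One design choice in this sketch is that an item with the wrong extension does not count against the limit. Whether the real downloader behaves the same way is not stated in the article.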

Downloading Torrents

A torrent file is a metadata file that tells BitTorrent clients how to download the requested data. It doesn't actually contain the content itself. If you would like the torrent files to be processed automatically, you need a torrent client installed and configured to watch a directory for newly arriving torrent files.

There are dozens of torrent clients available. I'm using the Transmission client on a Linux NAS server (this is why my rules point to a UNC share). A popular Windows-based torrent client is uTorrent, and below is a screenshot of how to configure uTorrent to watch a directory.

You need to ensure that your rules point to the directory defined in your torrent client settings.

XML
<?xml version="1.0" encoding="utf-8"?>
<ruleset revision="1.2" mode="rss">
  <!-- Kickass Torrents: Television -->
  <rules downloadPath="C:\Users\Shadow\uT\torrents\new" extensions="*.torrent" feedId="175" rssUrl="https://kat.cr/tv/?rss=1">
    <expression limit="1" enabled="true"><![CDATA[60.Minutes]]></expression>
  </rules>
  <history/>
</ruleset>

 

Image 7
Fig. 4: Configuring torrent client to pick up and process downloaded torrent files.

 

 

Image 8
Fig. 5: Transmission-Web processing the downloaded torrent files.

 

Downloading based on a schedule

Most RSS index sources only provide 25-50 records at a time and they turn over constantly. If you only run this downloader interactively once or twice a day, releases can scroll out of the index between runs and be missed. The ideal scenario is one where a Scheduled Task is configured to execute every 60 minutes or so.

 

Image 9
Fig. 6: Scheduled Task in Windows 2008 Server.

 

Moving the Rules File

By default the downloader looks for the rule-set definition (Rules.xml) in the same directory that the application is executing from. For various reasons you may want the rule-set definition to live in an alternate location. You can edit the rule-set file path in the Baileysoft.Rss.Downloader.exe.config file:

<!-- The path to Rules.xml (empty for working directory) -->
<add key="RULES_FILE_PATH" value="C:\Users\developer\AppData\Roaming\Baileysoft\Baileysoft.Rss.Downloader\Rules.xml"/>
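
The fallback described in the config comment (an empty value means the working directory) amounts to something like the sketch below. It is a Python illustration only. The real application reads a .NET config file, and `resolve_rules_path` and its parameters are hypothetical names.

```python
import os


def resolve_rules_path(configured_value, app_dir):
    """Use the configured path if one is set, else Rules.xml next to the app."""
    value = (configured_value or "").strip()
    if not value:
        return os.path.join(app_dir, "Rules.xml")
    return value


print(resolve_rules_path("", os.getcwd()))
```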

Dealing with HTTP 404 Errors

If you have any question regarding whether or not the content being downloaded is copyrighted material, look for the following message in the log file:

2016-04-06 09:05:07,063 [1] | ERROR | An error occurred downloading the file: The remote server returned an error: (404) Not Found.

Most torrent file distributors provide an automated copyright strike submission system. When content owners identify infringing links on various sites, the torrent files indexing the questionable material are reported and then swiftly taken down. This cat-and-mouse loop has created a sort of race between content owners and downloaders. Most downloaders simply shorten the check interval so they get the material as soon as possible, before it's taken down by a copyright strike.
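
For readers curious how a 404 like the one logged above might be detected and skipped rather than retried, here is a minimal Python sketch. It is not the downloader's C# code. The `fetch` function and its `opener` parameter are hypothetical; the only library behavior relied on is that `urllib.request.urlopen` raises `urllib.error.HTTPError` for HTTP error statuses.

```python
import urllib.error
import urllib.request


def fetch(url, opener=urllib.request.urlopen):
    """Return (data, None) on success, or (None, message) if the file is gone."""
    try:
        with opener(url) as response:
            return response.read(), None
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # The file was removed from the server: report it and move on.
            return None, f"skipped {url}: (404) Not Found"
        raise  # other HTTP errors are left for the caller to handle
```

Passing the opener in as a parameter keeps the function easy to exercise without touching the network.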

 

Image 10
Fig. 7: Infringing material is flagged and removed from the torrent depot.

 

COPYRIGHT STRIKE
Some torrent file distributors are hostile towards international copyright law and continue providing files with no regard whatsoever for the law, even in the face of legal action. For example, when torrents hosted by the torcache file server are taken down, Kickass Torrents will automatically update their magnet links so users who can no longer download the torrent file can still get the infringing material. These web sites are bad actors in the world community. If you see a 404 error in the downloader log file then you know the content is infringing and has been explicitly taken down. Persisting with magnet links or other methods of download is an obvious demonstration on your part that you have no regard for legal copyright.

Conclusion

While my Downloader certainly isn't a TiVo, it does an admirable job at automatically fetching my DVR subscriptions so I may watch them anywhere (rather than only on my television).

I've had a chance to play around with several downloaders and pretty much all of them are better than this one. My intent for this project was to create something that met my specific needs and goals.

Alternative Downloaders

History

  • 2016-04-06: Article completed
  • 2016-06-09: Article submitted to CodeProject Editorial Staff

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)


Written By
Software Developer
United States
I'm a professional .NET software developer and proud military veteran. I've been in the software business for 20+ years now and if there's one lesson I have learned over the years, it's that in this industry you have to be prepared to be humbled from time to time and never stop learning!
