Working with FTP is simple enough. This is the first project I have attempted in Talend: download CSV files from an FTP server over the internet, perform some data manipulation, and finally upload the modified files into another folder on the same FTP server.
The process of my Talend job is as follows:
- In a tPreJob, I read a config file to obtain the key parameters for this project and load them with a tContextLoad.
- The main job starts with a tJava SubJob. It contains my custom Java logic to redirect stdout and stderr to log files and to set up any necessary global variables. It is flexible, so you can put whatever initialization you want here.
- The deactivated SubJob contains a tContextDump, which lets me read the context (or config, in my own terms) that I have loaded for this job, and a tLogRow connected via Iterate to write it to stdout.
- Next, tFTPConnection connects to the FTP server and locates the desired folder, and tFTPGet downloads the files.
- tStatCatcher is used to read the statistics of any component that has the stat catcher option checked. (Honestly, I still have not figured out how to use this.)
- Connecting a tDie to a SubJob with an OnSubJobError trigger kills the main job when that particular SubJob fails, and outputs any error message that you have set.
- tPostJob is executed at the end, regardless of whether an error killed the job. I use it to close the FTP connection with tFTPClose and to clean up any streams that have been opened.
It works perfectly fine. However, I am not sure this is the best way to structure a job, so I will continue to improve on it.
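For the tJava step that redirects stdout and stderr, the logic could look roughly like the sketch below. This is not the exact code from my job: the class name `LogRedirect`, the method `redirectTo`, and the temp-file log location are made up for illustration, and it is written as a standalone class so it can be run outside Talend. Inside a tJava component only the body of `redirectTo` would be needed.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: sends System.out and System.err to one log file.
final class LogRedirect {

    // Opens the log file in append mode and points both streams at it.
    // The caller keeps the returned stream so it can be closed later
    // (for example in the tPostJob part of the job).
    static PrintStream redirectTo(Path logFile) throws IOException {
        PrintStream log = new PrintStream(
                new FileOutputStream(logFile.toFile(), true), // true = append
                true,                                         // autoflush on println
                "UTF-8");
        System.setOut(log);
        System.setErr(log);
        return log;
    }

    public static void main(String[] args) throws IOException {
        // Remember the console streams so they can be restored afterwards.
        PrintStream originalOut = System.out;
        PrintStream originalErr = System.err;

        // Assumption: a temp file stands in for the real log path.
        Path logFile = Files.createTempFile("talend_job", ".log");
        PrintStream log = redirectTo(logFile);
        try {
            System.out.println("job started");    // goes to the log file
            System.err.println("sample warning"); // goes to the same file
        } finally {
            System.setOut(originalOut);
            System.setErr(originalErr);
            log.close();
        }
        originalOut.println("log written to " + logFile);
    }
}
```

Keeping a reference to the returned stream matters because closing it is the cleanup that belongs in the tPostJob step.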
Here is part 2 of the job, which uploads the modified files back to the FTP server. I have split the work into two distinct jobs so that they can be reused independently in other projects. Reusability was also my top consideration when I decided to use context loading for these two jobs.
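To make the context-loading idea concrete, here is a plain-Java sketch of what loading key/value parameters from a config file amounts to. It is only an illustration of the concept, not how tContextLoad is implemented. The `key=value` file format, the class name `ContextConfig`, and the keys `ftp_host`, `ftp_port` and `remote_dir` are all assumptions of mine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of loading job parameters from a config file.
final class ContextConfig {

    // Reads "key=value" lines into a map, skipping blank lines,
    // lines starting with '#', and lines without a key.
    static Map<String, String> parse(BufferedReader reader) throws IOException {
        Map<String, String> context = new LinkedHashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int sep = line.indexOf('=');
            if (sep <= 0) {
                continue; // no '=' or empty key: ignore the line
            }
            String key = line.substring(0, sep).trim();
            String value = line.substring(sep + 1).trim();
            context.put(key, value);
        }
        return context;
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample config; the keys are invented for this example.
        String sample = String.join("\n",
                "# FTP settings shared by the download and upload jobs",
                "ftp_host=ftp.example.com",
                "ftp_port=21",
                "remote_dir=/incoming");
        Map<String, String> context = parse(new BufferedReader(new StringReader(sample)));
        for (Map.Entry<String, String> entry : context.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Because both jobs read their parameters from a file like this, pointing them at a different server or folder only means editing the config, not the jobs themselves.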