How SSL is failing…

Every web developer should know about the Secure Sockets Layer and it’s successor, the Transport Layer Security. (SSL and TLS.) These techniques are nowadays a requirement to keep the Internet secure and to keep private matters private. But while this technique is great, the implementation of it has some flaws.

Flaws that aren’t easy to resolve, though. But to understand, you will need to understand how this technique works when you use it to protect the communication between a visitor and a website.

This is a very long post as it is a complex topic and I just don’t want to split it in multiple posts. So I add several headers to separate parts…

Femke Wittemans ArmoredWhat is SSL?

When you communicate with a website, you’re basically have a line with two endpoints. You on one side and the web server on the other side. But this isn’t a direct line but goes from your computer to your router/modem, to your provider, over some other nodes on the Internet, to the provider of the host, the router of the host to the computer where the host is running the website on. And this is actually a very simplified explanation!

But it shows a problem. The communication between the two endpoints goes over various node and at every node, someone might be monitoring the communication, listening for sensitive information like usernames, passwords, social security numbers, credit card numbers and whole lot more. And to make it even more harder, you don’t know if your information will travel over nodes that you can trust! So, you need to make sure that the information you send over between both endpoints is secure. And that’s where SSL is used.

SSL is an encryption technique with asynchronous keys. That means that the host has a private key and a public key. It keeps the private key hidden while giving everyone the public key. You can generally use one key to encrypt a message but you would need the other key to decrypt it again. And that’s basically how security works! You have the public key and you encrypt your data with it before sending it to me. No one on those nodes have the private key, except me, so I’m the only one who can decrypt it. I can then encrypt a response with the private key and send that back to you, as you can decrypt it again with the public key.

Unfortunately, everyone on those nodes can too, as they would all know this public key as I had just sent it to you. So things need to be a little more secure here. Which fortunately is the case. But in general, you should never encrypt sensitive data with a private key as anyone with the public key would be able to read it!

Session keys…

But the Internet uses a better trick. Once you receive the public key from the web server, your browser will generate a session key, which is a synchronous encryption key. This session key can be used by anyone who knows it and right now, you would be the only one. But as you have my public key, you would use my public key to encrypt the session key and send it to me. Now I know it too and we can have a communication session using this session key for security. And once the session is done, the session key can be discarded as you would make a new one for every session.

Most people think the encryption is done using the SSL certificates but that’s not the case! SSL is only used to send a session key to the server so both can communicate safely. That is, as long as no one else knows that session key. Fortunately, session keys are short-lived so there isn’t much time for them to fall in the wrong hands…

The main weakness…

So, this seems pretty secure, right? My site sends you a public key, you create a session key and then we communicate using this session key and no one who is listening in on us can know what we are talking about! And for me it would be real easy to make the SSL certificate that would be used as most web servers can do this without any costs, and generally within just a few minutes. So, what can go wrong?

Well, the problem is called the “Man in the Middle” attack and this means that one of the nodes on the line will listen in on the communication and will intercept the request for secure communications! It will notice that you ask for a public key so it will give its own public key to you instead that of the host. It also asks for the public key of the host so it can relay all communications. You would then basically set up a secure line with the node and the node does the same with the host and will be able to listen in to anything that moves between you, as it has to decrypt and then encrypt each and every message. So, it can listen to sensitive data without you realizing that this is happening!

Authorities…

So the problem is that you need to know that the public key I gave you is my public key, and not the key of this node. How do you know for sure that this is my key? Well, this is where the Certificate Authorities (CA) have a role.

The CA has a simple role of validating the certificates that I use for my website. I want to secure my host so I make a certificate. I then ask the CA to sign this certificate for me. The CA then checks if I really am the owner of the domain at that specific moment or have at least some control over the domain. And when they believe that I’m the owner, they will sign my certificate.

Then, when you receive my public key then you can check the credentials of this certificate. It should tell you if it is for the specific domain that you’re visiting and it should be signed by the CA who issued the certificate. If it is signed correctly then the CA will have confirmed that this certificate is linked to my host and not that of some node between you and me.

Trust…

But the problem is that when your connection isn’t secure because some node is trying to listen than checking if my certificate is properly signed by the CA won’t work, as you would be requesting the CA to validate it over the same unsafe connection. The node will just claim it is so that option won’t work. No, you already need to know the public key of the CA on your system so you can decrypt my signature. And you need to have received the Ca’s certificate from some secure location. Otherwise, you can’t trust if my public key is the real thing.

So most web browsers have a list of public keys as part of their setup. When they install themselves, they will also include several trustworthy public keys from the more popular CA’s. Basically, the CA’s they deem reliable. So your browser will validate any public key from my site with the public keys it knows and trusts and if everything is okay, you will be notified that the connection is secure and the session key can be generated for further secure communications.

Otherwise, your browser will give you a warning telling you what’s wrong with the certificate. For example, it might be outdated or not meant for the specific domain. In general, you should not continue as the connection with the host has security problems!

Distrust!…

But here’s the thing… The list of trusted CA’s in your browser can be modified and to be honest, it sometimes gets modified for various reasons. Some are legitimate, others are not.

For example, this list is modified when a specific CA is deemed unreliable. This happens regularly with smaller CA’s but once in a while, some major scandal happens. For example, in 2011 it was discovered that the company DigiNotar had a security breach which had resulted in several certificates being falsified. Most likely, the Iranian Government was behind this in an attempt to check all emails that their citizens were sending through GMail. The fake certificates allowed them to listen in on all emails using the man in the middle technique. DigiNotar went bankrupt shortly afterwards, as all the certificates they had issued had become worthless.

Similar problems occurred at StartCom, a CA that actually gave away free certificates. The Israeli company was purchased by a Chinese company and some suspicious behavior happened soon afterwards. The fear was that this Chinese company (and perhaps even the Chinese government) would use this trust that StartCom had to make fake certificates to listen in on all communications in China. Both Mozilla and Google started to raise questions about this and didn’t get satisfying answers so they decided to drop the StartCom certificates. This CA had become controversial.

And then there’s Symantec. Symantec is a company that has been making software for decades that all relate to security. It is an American company and has been trustworthy for a long time. And in 2010 Symantec acquired Verisign’s authentication business unit which includes releasing SSL certificates for websites. But in 2015 it was discovered by Google that Symantec had issued several test certificates for impersonating Google and Opera. Further research has led Google to believe that Symantec has been publishing questionable certificates for over 7 years now and thus they announced that they will distrust Symantec SSL certificates in the near future. In April 2018, all Symantec certificates will be useless in Google Chrome and other browsers might follow soon.

Also interesting is that Symantec is selling their SSL business to DigiCert. This could solve the problem as DigiCert is still trusted. Or it makes things worse when browser manufacturers decide to distrust DigiCert from now on also.

But also: Telecom!

But there are more risks! Many people nowadays have mobile devices like tablets and phones. These devices are often included with a subscription to services of some mobile phone company. (T-Mobile and Vodafone, for example.) These companies also sell mobile devices to their customers and have even provided “free phones” to new subscriptions.

However, these companies will either provide you with a phone that has some of their software pre-installed on your new device or will encourage you to install their software to make better use of their services. The manufacturers of these mobile devices will generally do similar things if given a chance. And part of these additions they make to your Android or IOS device is to include their own root certificates with the others. This means that they are considered trustworthy by the browser on your device.

Is that bad? Actually, it is as it allows these companies to also do a man in the Middle attack on you. Your telecom provider and the manufacturer of your phone would be able to listen to all your data that you’re sending and receiving! This is worse, as local government might require these companies to listen in on your connection. It is basically a backdoor to your device and you should wonder why you would need to trust your provider directly. After all, your provider is just another node in your connection to the host.

Did you check the certificate?

So the problem with SSL is that it’s as reliable as the Certificate Authorities who made those certificates. It’s even worse if your device has been in the hands of someone who wants to listen in on your secure connections as they could install a custom trusted certificate. Then again, even some malware could install extra public keys in your trusted certificates list without you noticing. So while it is difficult to listen to your secure conversations, it is not impossible.

You should make a habit of checking any new SSL certificate that you see pop up in your browser and it would be a good idea if browsers would first ask you to check a certificate whenever they detect that a site has a new one. It would then be up to the user to decide to trust the certificate or not. And those certificates would then be stored so the browser doesn’t need to ask again.

Unfortunately, that would mean that you get this question to trust a certificate very often when you’re browsing various different sites and users will tend to just click ‘Ok’ to get rid of the question. So, that’s not a very good idea. Most users aren’t even able to know if a certificate is trustworthy or not!

For example, while you’re reading this blog entry, you might not have noticed that this page is also a secured page! But did you check the certificate? Did you notice that it is signed by Automattic and not by me? That it’s a certificate issued by “Let’s Encrypt“? Well, if the certificate is showing this then it should be the right one. (But when my host Automattic changes to another CA, this statement becomes invalid.)

ACME protocol…

And here things become interesting again. “Let’s Encrypt” gives away free certificates but they are only valid for a short time. This makes sense as domains can be transferred to other owners and you want to invalidate the old owner’s certificates as soon as possible. It uses a protocol called ACME which stands for “Automatic Certificate Management Environment. It’s basically a toolkit that will automate the generation of certificates for domains so even though the certificates are only valid for a short moment, they will be replaced regularly. This is a pretty secure setup, although you’d still have to trust this CA.

Problem is that “Let’s Encrypt” seems to prefer Linux over Windows as there is almost no good information available on how to use ACME on Windows in IIS. But another problem is that this protocol is still under development and thus still has some possible vulnerabilities. Besides, it is very complex, making it useless for less technical developers. The whole usage of certificates is already complex and ACME doesn’t make things easier to understand.

Also troublesome is that I tried to download the ACME client “Certify the Web” for Windows but my virus scanner blocked the download. So, now I have to ask myself if I still trust this download. I decided that it was better not to trust them, especially as I am trying to be secure. Too bad as it seemed to have a complete GUI which would have made things quite easy.

Don’t ignore security warnings! Not even when a site tells you to ignore them…

Additional problems?

Another problem with SSL is that it is an expensive solution so it is disliked by many companies who are hosting websites. It’s not the cost for the certificates, though. It’s the costs for hiring an expert on this matter and making sure they stay with the company! A minor issue is that these security specialists do have access to very sensitive material for companies so you need to be sure you can trust the employee.

Of course, for very small companies and developers who also host websites as a hobby, using SSL makes things a bit more expensive as the costs are generally per domain or sub domain. So if you have three domains and 5 subdomains then you need to purchase 8 certificates! That’s going to easily cost hundreds of euros per year. (You could use a Multi-Domain (SAN) Certificate but that will cost about €200 or more per year.)

Plus, there’s the risk that your CA does something stupid and becomes distrusted. That generally means they will have to leave the business and that the certificates you own are now worthless. Good luck trying to get a refund…

But another problem is that the whole Internet is slowly moving away from insecure connections (HTTP) to secure (HTTPS) connections, forcing everyone to start using SSL. Which is a problem as it starts to become a very profitable business and more and more malicious people are trying to fool people into buying fake or useless certificates or keep copies of the private key so they can keep listening to any communications done with their keys. This security business has become highly profitable!

So, alternatives?

I don’t know if there are better solutions. The problem is very simple: Man in the Middle. And the biggest problem with MITM is that he can intercept all communications so you need something on your local system that you already can trust. Several CA’s have already been proven untrustworthy so who do you trust? How do you make sure that you can communicate with my server without any problems?

There is the Domain Name Service but again, as the MITM is intercepting all your transactions, they can also listen in on DNS requests and provide false information. So if a public key would be stored within the DNS system, a node can just fake this when you request for it. The MITM would again succeed as you would not be able to detect the difference.

Blockchain?

So maybe some kind of blockchain technology? Blockchains have proven reliable with the Bitcoin technology as the only reason why people have lost bitcoins is because they were careless with the storage of their coins. Not because the technique itself was hacked. And as a peer-to-peer system you would not need a central authority. You just need to keep the blocks on your system updated at all times.

As we want the Internet to be decentralized, this technology would be the best option to do so. But if we want to move security into blockchains then we might have to move the whole DNS system into a blockchain. This would be interesting as all transactions in blockchains are basically permanent so you don’t have to pay a yearly fee to keep your domain registered.

But this is likely to fail as Bitcoin has also shown. Some people have lost their bitcoins because their disks crashed and they forgot to make a copy of their coins. Also, the file size of the Bitcoin blockchain has grown to 100 GB of data in 2017 which is quite huge. The whole DNS system is much bigger than Bitcoin is so it would quickly have various problems. Your browser would need to synchronize their data with the network which would take longer and longer as the amount of data grows every day.

So, no. That’s not really a good option. Even though a decentralized system sounds good.

Conclusion?

So, SSL has flaws. Then again, every security system will have flaws. The main flaw in SSL is that you need to trust others. And that has already proven to be a problem. You have to be ultra-paranoid to want to avoid all risks and a few people are this paranoid.

Richard Stallman, for example, is a great expert on software yet he doesn’t use a mobile phone and avoids using the Internet directly. Mobile phones are “portable surveillance and tracking devices” so he won’t use them. He also avoids key cards and other items that allow people to track wherever he goes. And he generally doesn’t access the Web directly, as this too would allow people to track what he’s doing. (He does use Tor, though.) And maybe he’s on to something. Maybe we are putting ourselves in danger with all this online stuff and various devices that we have on our wrists, in our pockets and at our homes.

Thing is that there is no alternative for SSL at this moment so being paranoid is useful to protect yourself. Everyone should be aware of the risks they take when they visit the Internet.  This also depends on how important or wealthy you are, as poor, boring people are generally not interesting for malicious people. There isn’t much to gain from people with no money.

Still, people are still too careless when they’re online. And SSL isn’t as secure as most people think, as events from the past have already proven…

 

How Yahoo has failed.

As many people have already read, Yahoo had a severe data leak in the past which resulted in ALL YAHOO ACCOUNTS being leaked to hackers. The hack includes sensitive personal information and includes an MD5 hash of the password you’ve used with Yahoo. This is a very serious issue as Yahoo has told me today in an email. It says:

Yahoo
UPDATED NOTICE OF DATA BREACH
Dear Yahoo User,
We are writing to update you about a data security issue Yahoo previously announced in December 2016. Yahoo already took certain actions in 2016, described below, to help secure your account in connection with this issue.
What Happened?On December 14, 2016, Yahoo announced that, based on its analysis of data files provided by law enforcement, the company believed that an unauthorized party stole data associated with certain user accounts in August 2013. Yahoo notified the users it had identified at that time as potentially affected. We recently obtained additional information and, after analyzing it with the assistance of outside forensic experts, we have determined that your user account information also was likely affected.
What Information Was Involved?

The stolen user account information may have included names, email addresses, telephone numbers, dates of birth, hashed passwords (using MD5) and, in some cases, encrypted or unencrypted security questions and answers. Not all of these data elements may have been present for your account. The investigation indicates that the information that was stolen did not include passwords in clear text, payment card data, or bank account information. Payment card data and bank account information are not stored in the system we believe was affected.
What We Are Doing

In connection with the December 2016 announcement, Yahoo took action to protect users (including you) beyond those identified at that time as potentially affected. Specifically:

  • Yahoo required potentially affected users to change their passwords.

  • Yahoo also required all other users who had not changed their passwords since the time of the theft to do so.

  • Yahoo invalidated unencrypted security questions and answers so they cannot be used to access an account.

We are closely coordinating with law enforcement on this matter, and continue to enhance our systems that detect and prevent unauthorized access to user accounts.

What You Can Do

While Yahoo already has taken action to help secure your account, we encourage you to consider the following account security recommendations:

  • Change your passwords and security questions and answers for any other accounts on which you used the same or similar information used for your Yahoo account.

  • Review your accounts for suspicious activity.

  • Be cautious of any unsolicited communications that ask for your personal information or refer you to a web page asking for personal information.

  • Avoid clicking on links or downloading attachments from suspicious emails.

Additionally, please consider using Yahoo Account Key, a simple authentication tool that eliminates the need to use a password on Yahoo altogether.
For More Information

For more information about this issue and our security resources, please visit the Yahoo 2013 Account Security Update FAQs page available at https://yahoo.com/security-update.

We value the trust our users place in us, and the security of our users remains a top priority.

Sincerely,
Chris Nims
Chief Information Security Officer

And yes, that’s bad… it’s even worse as the hack occurred in 2013 and it has taken Yahoo 4 years to confess everything about the hack. Well, everything? I’m still not sure if we’ve heard everything about this case. Worse, as Verizon recently took over Yahoo for a large sum of money, it could even have an impact for anyone using the Verizon services.

But there is more as people might not realise that the sites Tumblr and Flickr are also part of the Yahoo sites. We know that Yahoo is hacked but how about those other two sites? As I said, we might still not know everything…

Yahoo

About to drown by failing security.

Well, assume the worst. While we might be arming ourselves properly against any of these kinds of hacks, we also chain ourselves to the security provided by companies like Yahoo. And those security measures might not protect us against everything.

Fact is that Yahoo has become great and is becoming even bigger now they’re part of Verizon. As a result, all those 3 billion accounts are now owned by Verizon and we better hope that Verizon will use better security than Yahoo ever did. If not, anyone who ever used Yahoo, Tumblr, Flickr or Verizon might soon drown in security problems as their accounts have been hacked and they will continue to hack those.

Is there a solution to this problem? That’s a good question as there are many other companies that we rely upon for our security. Twitter, Google and Facebook are a few popular sites that are also popular targets for hackers. However, as long as these large corporations immediately notify all users if there’s a serious data breach and immediately respond by increasing security, the risks should be acceptable. What Yahoo did was wrong as it took 4 years before they finally admitted the truth!

So in my opinion, Yahoo has to disappear. It is unacceptable that any company with such a major role on the Internet regarding security is trying to hide the truth and keep people vulnerable instead of responding immediately. So instead of following Yahoo’s advise and change your password, I suggest everyone just close their Yahoo account. Permanently! You might still keep your Flickr and Tumblr account as those might not be involved in this hack but Yahoo should go.

And let’s hope that someone will improve the security on both Tumblr and Flickr as these services are highly popular all over the World.

Nieuwe ABN-AMRO phishing email!

(Dutch warning about a phishing email targeting ABN-AMRO customers. As it targets Dutch people, I write it in Dutch. Sorry…)

Vandaag weer een spam-bericht in mijn spambox ontvangen waarin men weer probeert om mensen op een link te laten klikken. Ik heb het maar meteen als “Phishing” aangemerkt maar het is een beetje onbegrijpelijk dat mensen hier soms toch intrappen want als je goed oplet zie je dat er niets van klopt!2017-06-16.png

Eerst en vooral komt de email binnen op een account die ik niet gebruik voor deze bank, hoewel ik er wel een account heb. Dit toont maar weer eens aan hoe praktisch het is om je eigen domeinnaam te hebben met een catch-all mailbox zodat je een oneindig aantal email adressen kunt aanmaken.

Andere waarschuwingen zijn de spaties in de datum, de titel “Trouwe Cliënt” en enkele andere taal- en stijlfouten in de tekst. Zo klinkt “betaal kaart” best raar als het om een betaalpas gaat. Duidelijk een gevalletje Google Translate.

Ook het verhaal erachter is vreemd want de bank heeft problemen in hun IT systemen en daardoor moet de klant opeens actie ondernemen? En zolang dat niet gebeurt is de account geblokkeerd?

Interessanter wordt het als je de bron van de email beter gaat controleren. De afzender maakt gebruik van een sub-domein van sodelor.eu en mogelijk is dit gehele domein een phishing-site. In ieder geval heeft het sub-domein een phishing pagina waarin het PayPal nabootst. Sowieso zou je PayPal als afzender verwachten, maar goed. Sommige mensen zijn idioten…

De email bevat ook een URL die verwijst naar een Russische website en dat verbaast mij niets. Russische domeinnamen worden vaak door hackers misbruikt omdat deze vaak eenvoudig te hacken zijn.

Als je verder de bron nakijkt zie je dat deze via de Duitse kundenserver.de worden verstuurd. Dit domein is ondertussen al op diverse blacklists geplaatst wegens de grote hoeveelheid spam die ermee wordt verzonden.

Maar goed, de meest duidelijke detectie dat dit spam is, is omdat het in mijn spam-folder zit.

Delicious spam!

Once more, a post about spam. Why? Because I have one more interesting email in my spam-box, sent by someone who clearly is confused by the whole topic. So, here’s the email, with some annotations:

spam-1486041897301

Why is it spam? Because Google Apps/GMail says it is. And google is often right in these things. And as I don’t know Adam Collier, nor see any name of his company, it clearly seems like spam to me too, from some wannabe web developer in India looking for customers without understanding the rules.

Why  from India? Well, the English writing is more British than American. The writing style is similar to how Indian spam is generally written, with only single-line paragraphs. The skill set used is also very common among Indian developers. The extreme politeness in the writing also is similar to what you see in mostly Asian countries, as people there are generally more polite. Then of course, it mentions India in the email too so that wasn’t difficult.

First of all, this email was sent from a genuine, free email address like those offered by Outlook, Gmail and Yahoo. I’m not going to say if it’s Outlook or not as I allow this guy some anonymity, even though his name is probably fake and the address already closed for sending spam. But for me that’s the first sign of spam. If it is sent from a free mail provider then you should make sure you know the sender before continuing! As usual, check the sender first for every email you receive!

Next is the address to where it was sent. While it seems to be my “info” account, it just isn’t! It was received by the account I used for my registrar and used in my domain registration where it is visible in the WhoIs information, including my name and some other details. The “info” address happens to be the address of some other website, who has also received this email. My address was actually part of the BCC header so other recipients would not see that I had received it. Smart, but it is to be expected from mass mailers as they would really piss off a lot of people if they only use the TO or CC fields, as many people tend to ‘Reply to all’ on spam messages, making even more spam.

So they got my address from the WhoIs database. So they should have known my name too! They just can’t use it as this is a mass email that’s probably sent to hundreds or even more people.As this spammer doesn’t seem to use any mass mailer application, I suspect that he just collected a lot of email addresses from interesting-looking domains and just mailed to them all from Outlook so the amount of recipients is likely to be hundreds, maybe thousands. Not the millions that more experienced spammers will use.

Interesting is how he’s called a webmanager in his email address while calling himself an online marketing manager in the email. No name for his business so maybe he doesn’t even have a real business. This could be a simple PHP developer who is trying to make a freelance web development business and is hoping to get some customers so he can expand his business. He might have a few friends who are also doing development and likely is a student at Computer Science classes in India who wants to put his lessons to the Test. This doesn’t look like a hardcore spammer, even though he is spamming. He’s more a lightweight spammer.

The prices he mentions are very reasonable. Then again, he basically uses standard frameworks like WordPress, Joomla, Magento and Drupal to build those sites which is generally not too much work. I call these “Do not expect too much from us” prices.

There is one major alert in all this, though. The grey line mentions a “Payment Gateway” which you should immediately distrust! Why? Because this developer is probably setting up this payment gateway and might have control over it later on. He could be siphoning off some of the payments made through it or even at one point empty all the money collected and put it in his own bank account! Good luck getting your money back!

Well, he could be honest but you should not take that risk to begin with…

It is interesting to see that he also provides Android and IOS applications. He seems to be specialized in PHP so he would need to know Swift or Objective-C to do the IOS development and Java for the Android development. Or have some other programming environment that allows him to develop for both platforms. He might be using Visual Studio with Xamarin which would allow him to focus on different platforms. Or he has friends who specialized in app development.

At the bottom of his email he tells you that this isn’t spam and that he actually hates spam. So if you aren’t interested you should just reply to him so he can confirm that your email address exists and is in use so he won’t be sending emails to it. Wait… Why does he need that? People who aren’t interested generally won’t respond! So he might actually be collecting confirmations for other purposes…

Anyways, it shows that many spammers are generally amateurs, not knowing what they’re doing. Some might work for some business and think they can promote it this way while others are just freelance developers trying to find a work in the current market. Both will generally learn that these kinds of emails are spam and generally end up being blacklisted or loose their free email account. The problem is not that they really want to spam people, but they are misguided in thinking that you can just send emails to everyone as part of their marketing strategy!

Unfortunately, it doesn’t work that way! If you send these kinds of messages unsolicited then you are spamming. If you seek new customers then you should start by registering your own domain name and provide proper information about yourself. Use your own domain name for sending emails and not some free provider and more important: use mailer software where people can subscribe and unsubscribe and only mail people who have subscribed! Also provide a simple web-based solution to unsubscribe as a link in your email. People might still consider it spam but at least the risks of being blacklisted becomes less as you’re conforming to the anti-spamming rules.

If you want to do proper business online then you need to be familiar with the rules. You should know about spam and how to avoid to becoming a spammer. You should have a clear profile of your business online, preferably under your own domain name. And you need to know about the legislations of the countries that you’re targeting like the cookie-laws and privacy laws in Europe. Thing is, if your site and services are targeting foreign nations then you are operating under their laws also! Never forget that!

And with that, this lesson ends…Marianne In Office.png

The need of security, part 3 of 3.

Azra Yilmaz Poses III

Enter a caption

Do we really need to hash data? And how do we use those hashed results? That is the current topic.

Hashing is a popular method to generate a key for a piece of data. This key can be used to check if the data is unmodified and thus still valid. It is often used as an index for data but also as a way to store passwords. The hashed value isn’t unique in general, though. It is often just a large number between a specific range of values. If this range happens to be 0 to 9, it would basically mean that any data will result in one of 10 values as identifier, so if we store 11 pieces of data as hashes, there will always be two pieces of data that generate the same hash value. And that’s called collisions.

There are various hashing algorithms that are created to have a large numerical range to avoid collisions. Chances of collisions are much bigger in smaller ranges. Many algorithms have also been created to generate a more evenly distribution of hash values which further reduces the chance of collisions.

So, let’s have a simple example. I will hash a positive number into a value between 0 and 9 by adding all digits to get a smaller number. I will repeat this for as long as the resulting number is larger than 9. So the value 654321 would be 6+5+4+3+2+1 or 21. That would become 2+1 thus the hash value would be 3. Unfortunately, this algorithm won’t divide all possible hash values equally. The value 0, will only occur when the original value is 0. Fortunately, the other numbers will be divided equally as can be proven by the following piece of code:

Snippet

using System;
 
namespace SimpleHash
{
    class Program
    {
        static int Hash(int value)
        {
            int result = 0;
            while (value > 0)
            {
                result += value % 10;
                value /= 10;
            }
            if (result >= 10) result = Hash(result);
            return result;
        }
 
        static void Main(string[] args)
        {
            int[] index = new int[10];
            for (int i = 0; i < 1000000; i++) { index[Hash(i)]++; }
            for (int i = 0; i < 10; i++) { Console.WriteLine("{0}: {1}", i, index[i]); }
            Console.ReadKey();
        }
    }
}

Well, it proves it only for the values up to a million, but it shows that 999,999 of the numbers will result in a value between 1 and 9 and only one in a value of 0, resulting in exactly 1 million values and 10 hash values.

As you can imagine, I use a hash to divide a large group of numbers in 10 smaller groups. This is practical when you need to search for data and if you have a bigger hash result. Imagine having 20 million unsorted records and a hash value that would be between 1 and 100,000. Normally, you would have to look through 20 million records but if they’re indexed by a hash value, you just calculate the hash for a piece of data and would only have to compare 200 records. That increases the performance, but at the cost of maintaining an index which you need to build. And the data needs to be an exact match, else the hash value will be different and you would not find it.

But we’re focusing on security now and the fact that you need to have a perfect match makes it a perfect way to check a password. Because you want to limit the amount of sensitive data, you should not want to store any passwords. If a user forgets a password, it can be reset but you should not be able to just tell them their current password. That’s a secret only the user should know.

Thus, by using a hash, you make sure the user provides the right password. But there is a risk of collisions so passwords like “Wodan5tr1ke$Again” and “123456” might actually result in the same hash value! So, the user thinks his password is secure, yet something almost everyone seems to have used as password will also unlock all treasures! That would be bad so you need two things to prevent this.

First of all, the hash algorithm needs to provide a huge range of possible values. The more, the better. If the result happens to be a 256-bit value then that would be great. Bigger numbers are even more welcome. The related math would be more complex but hashing algorithms don’t need to be fast anyways. Fast algorithms actually speed up brute-force attacks so with hashing, slower algorithms are better. The user can wait a second or two. But for a hacker, two seconds per attempt means he’ll spent weeks, months or longer just to try a million possible passwords through brute force.

Second of all, it is a good idea to filter all simple and easy to guess passwords and use a minimum length requirement together with an added complexity requirement like requiring upper and lower case letters together with a digit and special character. And users should not only pick a password that qualifies for these requirements but also when they enter a password, these checks should be performed before you check the hash value for the password. This way, even if a simple password collides with one of the more complex ones, it still will be denied since it doesn’t match the requirements.

Use regular expressions, if possible, for checking if a password matches all your requirements and please allow users to enter special characters and long passwords. I’ve seen too many sites which block the use of special characters and only use the first 6 characters for whatever reason, thus making their security a lot weaker. (And they also tend to store passwords in plain-text to add to the insult!)

Security is a serious business and you should never store more sensitive data than needed. Passwords should never be stored anyways. Only hashes.

If you want to make even a stronger password check, then concatenate the user name to the password. Convert the user name to upper case, though. (Or lower case) so the user name is case-insensitive. Don’t do the same with the password, though! The result of this will be that the username and password together will result in a hash value, so even if multiple people use the same password, they will still have different hashes.

Why is this important? It is because some passwords happen to be very common and if a hacker knows one such password, he could look in the database for similar hashes and he would know the proper passwords for those accounts too! By adding the user name, the hash will be different for every user,  even if they all use the same password. This trick is also often forgotten yet is simple enough to make your security a lot more secure.

You can also include the timestamp of when the user registered their account, their gender or other fixed data that won’t change after the account is created. Or if you allow users to change their account name, you would require them to provide their (new) password too, so you can calculate the new hash value.

The need of security, part 2 of 3.

Azra Yilmaz Poses II

Enter a caption

What is encryption and what do we need to encrypt? That is an important question that I hope to answer now.

Encryption is a way to protect sensitive data by making it harder to read the data. It basically has to prevent that people can look at it and immediately recognize it. Encryption is thus a very practical solution to hide data from plain view but it doesn’t stop machines from using a few extra steps to read your data again.

Encryption can be very simple. There’s the Caesar Cipher which basically shifts letters in the alphabet. In a time when most people were illiterate, this was actually a good solution. But nowadays, many people can decipher these texts without a lot of trouble. And some can do it just inside their heads without making notes. Still, some people still like to use ROT13 as a very simple encryption solution even though it’s almost similar to having no encryption at all. But combined with other encryption methods or even hashing methods, it could be making encrypted messages harder to read, because the input for the more complex encryption method has already a simple layer of encryption.

Encryption generally comes with a key. And while ROT13 and Caesar’s Cipher don’t seem to have one, you can still build one by creating a table that tells how each character gets translated. Than again, even the mathematical formula can be considered a key.

Having a single key will allow secret communications between two or more persons and thus keep data secure. Every person will receive a key and will be able to use it to decrypt any incoming messages. These are called symmetric-key algorithms and basically allows communication between multiple parties, where each member will be able to read all messages.

The biggest problem of using a single key is that the key might fall into the wrong hands, thus allowing more people access to the data than originally intended. That makes the use of a single key more dangerous in the long run but it is still practical for smaller sessions between multiple groups, as long as each member has a secure access to the proper key. And the key needs to be replaced often.

A single key could be used by chat applications where several people will join the chat. They would all retrieve a key from a central environment and thus be able to read all messages. But you should not store the information for a long time.

A single key can also be used to store sensitive data into a database, since you would only need a single key to read the data.

A more popular solution is an asymmetric-key algorithm or public-key algorithm. Here, you will have two keys, where you keep the private (master) key and give others the public key. The advantage of this system is that you can both encrypt and decrypt data with one of the two keys, but you can’t use the same key to reverse that action again. Thus it is very useful to send data into a single direction. Thus the private key encrypts data and you would need the public key to decrypt it. Or the public key encrypts data and you would need the private key to decrypt it.

Using two keys thus limits communication to a central hub and a group of people. Everything needs to be sent to the central hub and from there it can be broadcasted to the others. For a chat application it would be less useful since it means the central hub has to do more tasks. It needs to continuously decrypt and encrypt data, even if the hub doesn’t need to know the content of this data.

For things like email and secure web pages, two keys is practical, though. The mail or web server would give the public key to anyone who wants to connect to it so they can encrypt sensitive data before sending it to the server. And only the server can read it by using the private key. The server can then use the private key to encrypt new data and send it to the visitor, who will use the public key to decrypt the message again. Thus, you have secure communications between two parties.

Both methods have some very secure algorithms but also some drawbacks. Using a single key is risky if that key falls into the wrong hands. One way to solve this is by sending the single key using a two-key algorithm to the other side! That way, it is transferred in a secure way, as long as the key used by the receiver is secret enough. In general, that key should need to be a private key so only the recipient can read the single-key you’ve sent.

A single key is also useful when encrypting files and data inside databases since it would only require one key for both actions. Again, you would need to store the key in a secure way, which would again use a two-key algorithm. You would use a private key to encrypt the single key and include a public key in your application to decrypt this data again. You would also use that public key inside your applications only but it would allow you to use a single public key in multiple applications for access to the same data.

As I said, you need to limit access to data as much as possible. This generally means that you will be using various different keys for various purposes. Right now, many different encryption algorithms are already in use but most developers don’t even know if the algorithm they use is symmetrical or asymmetrical. Or maybe even a combination of both.

Algorithms like AES, Blowfish and RC4 are actually using a single key while systems like SSH, PGP and TLS are two-key algorithms. Single-key algorithms are often used for long-term storage of data, but the key would have additional security to avoid easy access to it. Two-key algorithms are often used for message systems, broadcasts and other forms of communication because it is meant to go into a single direction. You don’t want an application to store both a private key and matching public key because it makes encryption a bit more complex and would provide a hacker a way to get the complete pair.

And as I said, a single key allows easier communications between multiple participants without the need for a central hub to translate all messages. All the hub needs to do is create a symmetrical key and provide it to all participants so they can communicate with each other without even bothering the central hub. And once the key is deleted, no one would even be able to read this data anymore, thus destroying almost all traces of the data.

So, what solution would be best for your project? Well, for communications you have to decide if you use a central hub or not. The central hub could archive it all if it stays involved in all communications, but you might not always want this. If you can provide a single key to all participants then the hub won’t be needed afterwards.

For communications in one single direction, a two-key algorithm would be better, though. Both sides would send their public key to the other side and use this public key to send messages, which can only be decrypted by the private key which only one party has. It does mean that you actually have four keys, though. Two private keys and two public keys. But it happens to be very secure.

For data storage, using a single key is generally more practical, since applications will need this key to read the data. But this single key should be considered to be sensitive thus you need to encrypt it with a private key and use a public key as part of your application to decrypt the original key again.

In general, you should use encryption whenever you need to store sensitive data in a way that you can also retrieve it again. This is true for most data, but not always.

In the next part, I will explain hashing and why we use it.

The need of security, part 1 of 3.

Azra Yilmaz Poses I

Enter a caption

Of all the things developers have to handle, security tends to be a very important one. However, no one really likes security and we rather live in a society where you can leave your home while keeping your front door open. We generally don’t want to deal with security because it’s a nuisance!

The reality? We lock our doors, afraid that someone gets inside and steal things. Or worse, waits for us to return to kill us. We need it to protect ourselves since we’re living in a world where a few people have very bad intentions.  And we hate it because security costs money, since someone has to pay for the lock. And it takes time to use it, because locking and unlocking a door is still an extra action you need to take.

And when you’re developing software, you generally have the same problem! Security costs money and slows things down a bit. And it is also hard to explain to a client why they have to pay for security and why the security has to cost so much. Clients want the cheapest locks, yet expect their stuff is as safe as Fort Knox or even better.

The worse part of all security measures is that it’s never able to keep everyone out. A lock on your door won’t help if you still leave the window open. And if the window is locked, it is still glass that can be broken. The door can be kicked in too. There are always a lot of ways for the Bad People to get inside so what use is security anyways?

Well, the answer is simple: to slow down any would-be attacker so he can be detected and dealt with, and to make the break-in more expensive than the value of the loot stored inside. The latter means that the more valuable the loot is, the stronger your security needs to be. Fort Knox contains very valuable materials so it has a very strong security system with camera’s and lots of armed guards and extremely thick walls.

So, how does this all translate to software? Well, simple. The data is basically the loot that people are trying to get at. Legally, data isn’t property or doesn’t even has much legal protection so it can’t be stolen. However, data can be copyrighted or it can contain personal information about people. Or, in some cases, the data happens to be secrets that should not be exposed to the outside world. Examples of these three would be digital artwork, your name and bank account number or the formula for a deadly poison that can be made from basic household items.

Of all this data, copyrighted material is the most common item to protect, and this protection is made harder because this material is meant to be distributed. The movie and music industry is having a very hard time protecting all copyrighted material that they have and the same applies to photographers and other graphical artists. But also software developers. The main problem is that you want to distribute a product in return for payment and people are getting it without paying you. You could consider this lost profit, although if people had no option but pay for your product, they might not have wanted it in the first place. So the profit loss is hard to prove.

To protect this kind of material you will generally need some application that can handle the data that you’re publishing. For software, this would be easy because you would include additional code to your project that will check if the software has been legally installed or not. Often, this includes a serial number and additional license information and nowadays it tends to include calling a special web server to check if licenses are still valid.

For music and films, you can use a technique called DRM which works together with proper media players to make additional checks to see if the media copy happens to be from a legal source or not. But it would limit the use of your media to media players that support your DRM methods. And to get media players to support your DRM methods, you need to publish those methods and hope they’re secure enough. But DRM has already been bypassed by hackers many times so it has proven to be not as effective as people hoped.

And then there’s a simpler option. Add a copyright notice to the media. This is the main solution for artwork anyways, since there’s no DRM for just graphic images. You might make the image part of an executable but then you have to build your own picture viewer and users won’t be able to use your image. Not many people want to just see images, unless it is pornography. So you will have to support the basic image file formats, which are generally .JPG or .PNG for any image on the Internet. Or .GIF for animations. And you protect them by adding a warning in the form of a copyright notice. Thus, if someone is misusing your artwork and you discover the use of your art without a proper license, then you can start legal actions against the violator and claim damages. This would start by sending a bill and if they don’t pay, go to court and have a judge force them to pay.

But media like films, music and images tend to be hard to protect and often require going to court to protect your intellectual property. And you won’t always win such cases either.

Next on the list is sensitive, personal information. Things like usernames and passwords, for example. One important rule to remember is that usernames should always be encrypted and passwords should always be hashed. These are two different techniques to protect data and will be explained in the next parts.

But there is more sensitive data that might need to be stored and which would be valuable. An email address could be misused to spam people so that needs to be encrypted. Name, address and phone numbers can be used to look up people and annoy those people by ordering stuff all over the Internet and have it sent to their address. Or to make fake address changes to change their address to somewhere else, so they won’t receive any mail or other services. Or even to visit the address, wait until the people left the house and then break in. And what has happened in the past with addresses of young children is that a child molester learns of their address and goes to visit them to rape and/or kidnap them. So, this information is also sensitive and needs to be encrypted.

Other important information would be bank account information, medical data and employment history would be sensitive enough to have encrypted. Order information from visitors might also be sensitive if the items were expensive since those items would become interesting things to steal. You should basically evaluate every piece of information to determine if it needs to be encrypted or not. In case of doubts, encrypt it just to make it more secure.

Do keep in mind that you can often generate all kinds of reports about this personal data. A simple address list of all your customers, for example. Or the complete medical file of a patient. These documents are sensitive too and need to be protected, but they’re also just basic media like films and artwork so copies of those reports are hard to protect and often not protected by copyrights. So be very careful with report generators and have report contain warnings about how sensitive the data in it actually is. Also useful is to have a cover page included as the first page of a report, in case people will print it. The cover page would thus cover the content if the user keeps it closed. It’s not much protection but all small bits are useful and a cover page prevents easy reading by passer-by’s of the top page of the report.

Personal information is generally protected by privacy laws and thus misuse of personal information is often considered a criminal offense. This is unlike copyright violations, which are just civil offenses in general. But if you happen to be a source of leaking personal information, you and your company could be considered guilty of the same offense and will probably be forced to pay for damages and sometimes a large fine in case of clear negligence in protecting this data.

The last part of sensitive data tends to be ideas, trade secrets and more. In general, these are just media files like reports and thus hard to protect, although there are systems that could store specific data as personal data so you can limit access to it. Ideas and other similar data are often not copyrightable. You can’t get copyright on an idea. You can only get copyright over the document that explains your idea but anyone who hears about your idea can just use it. So if you find a solution for cleaner energy, anyone else could basically build your idea into something working and make profits from it without providing you any compensation. They don’t even have to say it was your idea!

Still, to protect ideas you can use a patent, which you will have to register in many countries just to protect your idea everywhere. Patents become open to the public so everyone will know about it and be able to use it, but they will need to compensate you for using your idea. And you can basically set any price you like. This system tends to be used by patent trolls in general, since they describe very generic ideas and then go after anyone who seems to use something very similar to their idea. They often claim an amount of damages that would be lower than the legal amount it would cost the accused to fight back, so they tend to get paid for this trolling. This is why many are calling for patent reforms to stop these patent trolls from abusing the system.

So, ideas are very sensitive. You generally don’t want to share them with the generic public since it would allow others to implement your ideas. Patents are a bit expensive and not always easy to protect. And you can’t patent everything anyways. Some patents will be refused because they’ve already been patented before. And yet you still need to share them with others so you can build the idea into a project. And for this, you would use a non-disclosure agreement or NDA.

An NDA is basically a contract to make sure you can share your idea with others and they won’t be allowed to share it with more people without your permission. And if your idea does get leaked, those others would have to compensate your financial losses due to leakage as mentioned in the NDA contract. It’s not very secure but it generally does prevent people from leaking your ideas.

Well, except for possible whistleblowers who might leak information about any illegal or immoral parts of your idea. For example, if your idea happens to be to blow up the subway in Amsterdam and have an NDA with a few other terrorists to help you then it becomes difficult when one of those others just walks to the police to report you and those who help you. The NDA just happens to be a contract and can be invalidated for many reasons, including the more obvious criminal actions that would relate to it.

But there are also so-called blacklists of things you can’t force in an NDA, depending on the country where you live. It is just a contract and thus handled by the Civil courts. And if the NDA violates the rights of those who sign it then it could be invalid. One such thing would be the right of free speech, where you would ban people from even discussing if your idea happens to be good or not.

Other sensitive information would be things like instructions on how to make explosives or business information about the future plans of Intel, which could influence the stock market. Some of this information could get you into deep trouble, including the Civil Court or Criminal Court as part of your troubles, resulting in fines and possibly imprisonment if they are leaked.

In general, sensitive information isn’t meant to be shared with lots of people so you should seriously limit access to such information. It should not be printed and you should not email this information either. The most secure location for this information would be on a computer with no internet connection but having a strong firewall that blocks most access methods would be good enough for many purposes.

So we have media, which is hard to protect because it is meant to be published. We have sensitive data which should not fall in the wrong hands for various reasons and we have personal data, which is basically a special case of sensitive data that relates to people and thus has additional laws as protection.

And the way to secure it is by posting warnings and limiting access to the data, which is difficult if it was meant to be published. But for those data that we want to keep private, we have two ways of protecting it next to limiting the physical access to this data.

To keep things private, you will need to have user accounts with passwords or other security keys to lock the data and limit access to it. And these user accounts are already sensitive data so you should start with protecting it here, already.

Of all the things software developers do, security happens to be the most complex and expensive part, since it doesn’t provide any returns on investments made. All it does is try to provide assurance that data will only be available for those who are meant to use it.

The two ways to protect data is through encryption and through hashing, which are two similar things, yet also differ in their purpose. I will discuss both in my next posts.