How SSL is failing…

Every web developer should know about the Secure Sockets Layer (SSL) and its successor, Transport Layer Security (TLS). These techniques are nowadays a requirement to keep the Internet secure and to keep private matters private. But while the technique is great, its implementation has some flaws.

Flaws that aren’t easy to resolve, though. But to understand them, you first need to know how this technique works when it protects the communication between a visitor and a website.

This is a very long post as it is a complex topic and I just don’t want to split it into multiple posts. So I’ve added several headers to separate the parts…

What is SSL?

When you communicate with a website, you basically have a line with two endpoints: you on one side and the web server on the other. But this isn’t a direct line. It goes from your computer to your router/modem, to your provider, over some other nodes on the Internet, to the provider of the host and the router of the host, and finally to the computer on which the host runs the website. And this is actually a very simplified explanation!

But it shows a problem. The communication between the two endpoints goes over various nodes and at every node, someone might be monitoring the communication, listening for sensitive information like usernames, passwords, social security numbers, credit card numbers and a whole lot more. And to make it even harder, you don’t know if your information will travel over nodes that you can trust! So, you need to make sure that the information you send between both endpoints is secure. And that’s where SSL is used.

SSL is an encryption technique with asymmetric keys. That means that the host has a private key and a public key. It keeps the private key hidden while giving everyone the public key. Whatever is encrypted with one key can only be decrypted with the other key. And that’s basically how the security works! You have the public key and you encrypt your data with it before sending it to me. No one on those nodes has the private key, except me, so I’m the only one who can decrypt it. I can then encrypt a response with the private key and send that back to you, as you can decrypt it again with the public key.

Unfortunately, everyone on those nodes can too, as they all know the public key that I just sent to you. So things need to be a little more secure here. Which fortunately is the case. But in general, you should never encrypt sensitive data with a private key, as anyone with the public key would be able to read it!
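To make the two directions concrete, here is a minimal sketch of textbook RSA in Python. The tiny primes and the sample message are assumptions picked for readability; real keys are thousands of bits long and real implementations add padding, so treat this as an illustration only.

```python
# Toy RSA with tiny primes (illustration only, completely insecure).
p, q = 61, 53
n = p * q                      # public modulus, 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent, 2753 (needs Python 3.8+)


def apply_key(number: int, exponent: int, modulus: int) -> int:
    """Encrypt or decrypt: both are the same modular exponentiation."""
    return pow(number, exponent, modulus)


message = 65  # must be smaller than n in this toy setup

# Direction 1: you encrypt with my public key, only I can decrypt.
ciphertext = apply_key(message, e, n)
recovered = apply_key(ciphertext, d, n)

# Direction 2: I encrypt with my private key, EVERYONE with the
# public key can decrypt, so this hides nothing.
from_private = apply_key(message, d, n)
readable_by_anyone = apply_key(from_private, e, n)

print(recovered, readable_by_anyone)
```

Both directions return the original number, which is exactly why the second direction offers no secrecy: the public exponent is known to every node on the line.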

Session keys…

But the Internet uses a better trick. Once you receive the public key from the web server, your browser will generate a session key, which is a symmetric encryption key: the same key both encrypts and decrypts. This session key can be used by anyone who knows it and right now, you would be the only one. But as you have my public key, you would use it to encrypt the session key and send it to me. Now I know it too and we can have a communication session using this session key for security. And once the session is done, the session key can be discarded as you would make a new one for every session.

Most people think the encryption is done using the SSL certificates but that’s not the case! The key pair behind the certificate is only used to get a session key to the server so both sides can communicate safely. That is, as long as no one else knows that session key. Fortunately, session keys are short-lived so there isn’t much time for them to fall into the wrong hands…
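The flow above can be sketched in a few lines of Python. Everything here is a toy under stated assumptions: the RSA key uses tiny primes, the session key is wrapped byte by byte, and the "symmetric cipher" is a simple SHA-256 keystream XOR that stands in for a real cipher such as AES. It mirrors the key-transport idea described in this post; newer TLS versions typically agree on the session key through a Diffie-Hellman exchange instead.

```python
import hashlib
import secrets

# Toy RSA key pair of the server (tiny primes, insecure, illustration only).
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256 based keystream.

    The same call encrypts and decrypts, which is what 'symmetric' means.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(4, "big")
        block = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, block))
    return bytes(out)


# Browser side: invent a session key and wrap it with the server's PUBLIC key.
session_key = secrets.token_bytes(16)
wrapped_key = [pow(byte, E, N) for byte in session_key]
ciphertext = keystream_xor(session_key, b"username=alice&password=hunter2")

# Server side: unwrap the session key with the PRIVATE key, then decrypt.
unwrapped_key = bytes(pow(value, D, N) for value in wrapped_key)
plaintext = keystream_xor(unwrapped_key, ciphertext)

print(plaintext)
```

Only the small session key goes through the slow asymmetric step. All further traffic uses the fast symmetric key, which is thrown away when the session ends.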

The main weakness…

So, this seems pretty secure, right? My site sends you a public key, you create a session key and then we communicate using this session key and no one who is listening in on us can know what we are talking about! And for me it would be really easy to make the SSL certificate for this, as most web servers can generate one at no cost, generally within just a few minutes. So, what can go wrong?

Well, the problem is called the “Man in the Middle” attack and this means that one of the nodes on the line will listen in on the communication and will intercept the request for secure communications! It will notice that you ask for a public key so it will give you its own public key instead of that of the host. It also asks for the public key of the host so it can relay all communications. You would then basically set up a secure line with the node, and the node does the same with the host. The node is then able to listen in on anything that moves between you and the host, as it has to decrypt and then re-encrypt each and every message. So, it can listen to sensitive data without you realizing that this is happening!
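A small simulation shows how little the attacker has to do. This is a sketch under assumptions: both key pairs are toy RSA keys built from tiny primes, and the "secret" is just a number standing in for a session key.

```python
def make_keypair(p: int, q: int, e: int):
    """Build a toy RSA key pair (tiny primes, insecure, illustration only)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), (n, d)          # (public key, private key)


def encrypt(number: int, public_key) -> int:
    n, e = public_key
    return pow(number, e, n)


def decrypt(number: int, private_key) -> int:
    n, d = private_key
    return pow(number, d, n)


server_public, server_private = make_keypair(61, 53, 17)
attacker_public, attacker_private = make_keypair(47, 71, 79)

# 1. The browser asks for the server's public key, but the node in the
#    middle answers with its own key. The browser cannot tell the difference.
key_seen_by_browser = attacker_public
secret = 1234
sent_by_browser = encrypt(secret, key_seen_by_browser)

# 2. The attacker decrypts, reads the secret, and re-encrypts it with the
#    real server key so nothing looks broken.
stolen = decrypt(sent_by_browser, attacker_private)
relayed = encrypt(stolen, server_public)

# 3. The server decrypts as usual and sees nothing suspicious.
received_by_server = decrypt(relayed, server_private)

print(stolen, received_by_server)
```

Both ends see a working encrypted connection, yet the node in the middle has read the secret. The encryption itself was never broken; the browser simply trusted the wrong key.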

Authorities…

So the problem is that you need to know that the public key I gave you is my public key, and not the key of this node. How do you know for sure that this is my key? Well, this is where the Certificate Authorities (CA) have a role.

The CA has a simple role of validating the certificates that I use for my website. I want to secure my host so I make a certificate. I then ask the CA to sign this certificate for me. The CA then checks if I really am the owner of the domain at that specific moment or have at least some control over the domain. And when they believe that I’m the owner, they will sign my certificate.

Then, when you receive my public key, you can check the credentials of this certificate. It should tell you if it is for the specific domain that you’re visiting and it should be signed by the CA who issued the certificate. If it is signed correctly then the CA has confirmed that this certificate is linked to my host and not to some node between you and me.
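The signing and checking described above can be sketched as follows. This is a toy under loud assumptions: the CA key is built from two well-known Mersenne primes (far too weak for real use), a "certificate" is just a domain name plus a public key, and the function names are made up for this example. Real certificates use the X.509 format and padded signatures.

```python
import hashlib

# Toy CA key pair from two known Mersenne primes (insecure, illustration only).
CA_P, CA_Q = 2**61 - 1, 2**89 - 1
CA_N = CA_P * CA_Q
CA_E = 65537
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))   # known only to the CA


def fingerprint(domain: str, public_key) -> int:
    """Hash the certificate contents down to a number below the CA modulus."""
    data = f"{domain}|{public_key[0]}|{public_key[1]}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N


def ca_sign(domain: str, public_key) -> int:
    """The CA encrypts the fingerprint with its PRIVATE key."""
    return pow(fingerprint(domain, public_key), CA_D, CA_N)


def browser_verify(domain: str, public_key, signature: int) -> bool:
    """The browser undoes the signature with the CA's PUBLIC key and compares."""
    return pow(signature, CA_E, CA_N) == fingerprint(domain, public_key)


my_key = (3233, 17)          # the host's toy public key
node_key = (3337, 79)        # the key of a node in the middle

signature = ca_sign("example.org", my_key)

print(browser_verify("example.org", my_key, signature))     # genuine certificate
print(browser_verify("example.org", node_key, signature))   # swapped key
print(browser_verify("other.example", my_key, signature))   # wrong domain
```

This is the one place where encrypting with a private key is useful: it hides nothing, but it proves that the holder of the CA’s private key vouched for exactly this domain and key.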

Trust…

But the problem is that when your connection isn’t secure because some node is trying to listen in, then checking if my certificate is properly signed by the CA won’t work if you have to ask the CA to validate it over the same unsafe connection. The node will just claim that it is valid, so that option won’t work. No, you already need to have the public key of the CA on your system so you can decrypt the signature on my certificate. And you need to have received the CA’s certificate from some secure location. Otherwise, you can’t trust that my public key is the real thing.

So most web browsers have a list of public keys as part of their setup. When they install themselves, they will also include several trustworthy public keys from the more popular CA’s. Basically, the CA’s they deem reliable. So your browser will validate any public key from my site with the public keys it knows and trusts and if everything is okay, you will be notified that the connection is secure and the session key can be generated for further secure communications.

Otherwise, your browser will give you a warning telling you what’s wrong with the certificate. For example, it might have expired or not be meant for the specific domain. In general, you should not continue as the connection with the host has security problems!
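You can look at such a list of trusted CA’s yourself. The sketch below uses Python’s standard `ssl` module, which reads the trust store of the operating system or of OpenSSL. That is an assumption worth stressing: it is not necessarily the same list your browser uses, and on some systems the certificates are only loaded on demand, so the count may be zero at first.

```python
import ssl

# A default context verifies certificates and host names against the
# system's trusted CA list, much like a browser does with its own list.
context = ssl.create_default_context()

stats = context.cert_store_stats()
print("CA certificates loaded so far:", stats["x509_ca"])

# Show a few of the trusted authorities and when their certificates expire.
for ca in context.get_ca_certs()[:5]:
    subject = {key: value for rdn in ca.get("subject", ()) for key, value in rdn}
    print(subject.get("organizationName"), "| expires", ca.get("notAfter"))
```

Every name this prints belongs to an organisation that can vouch for any website as far as this machine is concerned.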

Distrust!…

But here’s the thing… The list of trusted CA’s in your browser can be modified and to be honest, it sometimes gets modified for various reasons. Some are legitimate, others are not.

For example, this list is modified when a specific CA is deemed unreliable. This happens regularly with smaller CA’s but once in a while, some major scandal happens. For example, in 2011 it was discovered that the company DigiNotar had a security breach which had resulted in several certificates being falsified. Most likely, the Iranian Government was behind this in an attempt to check all emails that their citizens were sending through Gmail. The fake certificates allowed them to listen in on all emails using the Man in the Middle technique. DigiNotar went bankrupt shortly afterwards, as all the certificates they had issued had become worthless.

Similar problems occurred at StartCom, a CA that actually gave away free certificates. The Israeli company was purchased by a Chinese company and some suspicious behavior happened soon afterwards. The fear was that this Chinese company (and perhaps even the Chinese government) would use this trust that StartCom had to make fake certificates to listen in on all communications in China. Both Mozilla and Google started to raise questions about this and didn’t get satisfying answers so they decided to drop the StartCom certificates. This CA had become controversial.

And then there’s Symantec. Symantec is an American company that has been making security software for decades and has been trustworthy for a long time. In 2010 Symantec acquired Verisign’s authentication business unit, which includes issuing SSL certificates for websites. But in 2015 Google discovered that Symantec had issued several test certificates for impersonating Google and Opera. Further research has led Google to believe that Symantec has been issuing questionable certificates for over 7 years now and thus they announced that they will distrust Symantec SSL certificates in the near future. In April 2018, all Symantec certificates will be useless in Google Chrome and other browsers might follow soon.

Also interesting is that Symantec is selling their SSL business to DigiCert. This could solve the problem as DigiCert is still trusted. Or it makes things worse when browser manufacturers decide to distrust DigiCert from now on also.

But also: Telecom!

But there are more risks! Many people nowadays have mobile devices like tablets and phones. These devices are often included with a subscription to the services of some mobile phone company. (T-Mobile and Vodafone, for example.) These companies also sell mobile devices to their customers and have even provided “free phones” to new subscribers.

However, these companies will either provide you with a phone that has some of their software pre-installed or will encourage you to install their software to make better use of their services. The manufacturers of these mobile devices will generally do similar things if given a chance. And one of the additions they make to your Android or iOS device is their own root certificates, included with the others. This means that they are considered trustworthy by the browser on your device.

Is that bad? Actually, it is, as it allows these companies to also do a Man in the Middle attack on you. Your telecom provider and the manufacturer of your phone would be able to listen to all the data that you’re sending and receiving! It gets worse, as the local government might require these companies to listen in on your connection. It is basically a backdoor to your device and you should wonder why you would need to trust your provider directly. After all, your provider is just another node in your connection to the host.

Did you check the certificate?

So the problem with SSL is that it’s only as reliable as the Certificate Authorities who signed those certificates. It’s even worse if your device has been in the hands of someone who wants to listen in on your secure connections, as they could install a custom trusted certificate. Then again, even some malware could install extra public keys in your trusted certificates list without you noticing. So while it is difficult to listen to your secure conversations, it is not impossible.

You should make a habit of checking any new SSL certificate that you see pop up in your browser and it would be a good idea if browsers would first ask you to check a certificate whenever they detect that a site has a new one. It would then be up to the user to decide to trust the certificate or not. And those certificates would then be stored so the browser doesn’t need to ask again.

Unfortunately, that would mean that you get this question to trust a certificate very often when you’re browsing various different sites and users will tend to just click ‘Ok’ to get rid of the question. So, that’s not a very good idea. Most users aren’t even able to know if a certificate is trustworthy or not!

For example, while you’re reading this blog entry, you might not have noticed that this page is also a secured page! But did you check the certificate? Did you notice that it is signed by Automattic and not by me? That it’s a certificate issued by “Let’s Encrypt”? Well, if the certificate is showing this then it should be the right one. (But when my host Automattic changes to another CA, this statement becomes invalid.)
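Checking a certificate does not have to be done by clicking through browser dialogs. Here is a minimal sketch with Python’s standard `ssl` and `socket` modules. The helper names and the sample certificate are made up for illustration; the dictionary layout is the one `getpeercert()` returns.

```python
import socket
import ssl


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to the host and return its already validated certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def flatten_name(name) -> dict:
    """Turn the nested tuples used for subject/issuer into a plain dict."""
    return {key: value for rdn in name for key, value in rdn}


def summarize(cert: dict) -> dict:
    """Pick out who the certificate is for, who signed it, and until when."""
    return {
        "subject": flatten_name(cert.get("subject", ())).get("commonName"),
        "issuer": flatten_name(cert.get("issuer", ())).get("organizationName"),
        "expires": cert.get("notAfter"),
    }


# Hypothetical certificate data in the shape getpeercert() produces.
SAMPLE_CERT = {
    "subject": ((("commonName", "blog.example.org"),),),
    "issuer": (
        (("countryName", "US"),),
        (("organizationName", "Let's Encrypt"),),
        (("commonName", "Example Issuing CA"),),
    ),
    "notAfter": "Jan 31 00:00:00 2018 GMT",
}

print(summarize(SAMPLE_CERT))
# For a live site (needs network access): summarize(fetch_certificate("example.org"))
```

If the issuer suddenly differs from what you saw last time, that does not prove an attack, but it is a reason to take a closer look.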

ACME protocol…

And here things become interesting again. “Let’s Encrypt” gives away free certificates but they are only valid for a short time. This makes sense as domains can be transferred to other owners and you want to invalidate the old owner’s certificates as soon as possible. It uses a protocol called ACME which stands for “Automatic Certificate Management Environment”. It’s basically a toolkit that will automate the generation of certificates for domains so even though the certificates are only valid for a short moment, they will be replaced regularly. This is a pretty secure setup, although you’d still have to trust this CA.
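The "replaced regularly" part boils down to a simple date check that an automated client runs on a schedule. The sketch below is not the ACME protocol itself, only that renewal decision. The 30-day margin and the function names are assumptions for this example; the date format is the one Python’s `ssl` module reports for certificates.

```python
import ssl
from datetime import datetime, timezone

RENEW_BEFORE_DAYS = 30   # assumed safety margin, pick whatever suits you


def days_left(not_after: str, now: datetime) -> float:
    """Days until the certificate expires, given its 'notAfter' field."""
    expires = ssl.cert_time_to_seconds(not_after)
    return (expires - now.timestamp()) / 86400


def needs_renewal(not_after: str, now: datetime,
                  margin_days: int = RENEW_BEFORE_DAYS) -> bool:
    """True when the certificate should be replaced by a fresh one."""
    return days_left(not_after, now) < margin_days


not_after = "Jan 31 00:00:00 2018 GMT"   # hypothetical expiry date
today = datetime(2018, 1, 11, tzinfo=timezone.utc)

print(days_left(not_after, today), needs_renewal(not_after, today))
```

In practice you would pass `datetime.now(timezone.utc)` and let a scheduled task request a new certificate whenever the check returns true.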

Problem is that “Let’s Encrypt” seems to prefer Linux over Windows as there is almost no good information available on how to use ACME on Windows in IIS. But another problem is that this protocol is still under development and thus still has some possible vulnerabilities. Besides, it is very complex, making it useless for less technical developers. The whole usage of certificates is already complex and ACME doesn’t make things easier to understand.

Also troublesome is that I tried to download the ACME client “Certify the Web” for Windows but my virus scanner blocked the download. So, now I have to ask myself if I still trust this download. I decided that it was better not to trust them, especially as I am trying to be secure. Too bad as it seemed to have a complete GUI which would have made things quite easy.

Don’t ignore security warnings! Not even when a site tells you to ignore them…

Additional problems?

Another problem with SSL is that it is an expensive solution so it is disliked by many companies who are hosting websites. It’s not the cost for the certificates, though. It’s the costs for hiring an expert on this matter and making sure they stay with the company! A minor issue is that these security specialists do have access to very sensitive material for companies so you need to be sure you can trust the employee.

Of course, for very small companies and developers who also host websites as a hobby, using SSL makes things a bit more expensive as the costs are generally per domain or subdomain. So if you have 3 domains and 5 subdomains then you need to purchase 8 certificates! That’s going to easily cost hundreds of euros per year. (You could use a Multi-Domain (SAN) Certificate but that will cost about €200 or more per year.)

Plus, there’s the risk that your CA does something stupid and becomes distrusted. That generally means they will have to leave the business and that the certificates you own are now worthless. Good luck trying to get a refund…

But another problem is that the whole Internet is slowly moving away from insecure connections (HTTP) to secure connections (HTTPS), forcing everyone to start using SSL. That is a problem because it is turning into a very profitable business, and more and more malicious people are trying to fool people into buying fake or useless certificates, or keep copies of the private keys so they can keep listening to any communications done with those keys.

So, alternatives?

I don’t know if there are better solutions. The problem is very simple: Man in the Middle. And the biggest problem with MITM is that he can intercept all communications so you need something on your local system that you already can trust. Several CA’s have already been proven untrustworthy so who do you trust? How do you make sure that you can communicate with my server without any problems?

There is the Domain Name System but again, as the MITM is intercepting all your transactions, they can also listen in on DNS requests and provide false information. So if a public key were stored within the DNS system, a node could just fake it when you request it. The MITM would again succeed as you would not be able to detect the difference.

Blockchain?

So maybe some kind of blockchain technology? Blockchains have proven reliable with the Bitcoin technology as the only reason why people have lost bitcoins is because they were careless with the storage of their coins. Not because the technique itself was hacked. And as a peer-to-peer system you would not need a central authority. You just need to keep the blocks on your system updated at all times.

As we want the Internet to be decentralized, this technology would be the best option to do so. But if we want to move security into blockchains then we might have to move the whole DNS system into a blockchain. This would be interesting as all transactions in blockchains are basically permanent so you don’t have to pay a yearly fee to keep your domain registered.
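What "basically permanent" means can be shown with a toy hash chain. This is only a sketch: it leaves out everything that makes a real blockchain decentralized (peers, consensus, proof of work), and the record layout with a domain and a public key is an assumption made for this example.

```python
import hashlib
import json

GENESIS = "0" * 64   # placeholder 'previous hash' for the very first block


def block_hash(index: int, record: dict, prev_hash: str) -> str:
    """Hash a block's contents together with the hash of its predecessor."""
    payload = json.dumps(
        {"index": index, "record": record, "prev": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def append_block(chain: list, record: dict) -> None:
    """Add a record, linking it to the block before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    chain.append({
        "index": index,
        "record": record,
        "prev": prev_hash,
        "hash": block_hash(index, record, prev_hash),
    })


def chain_is_valid(chain: list) -> bool:
    """Recompute every hash; any edit to an old block breaks the chain."""
    prev_hash = GENESIS
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, block["record"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, {"domain": "example.org", "public_key": "key-of-owner"})
append_block(chain, {"domain": "example.net", "public_key": "another-key"})
print(chain_is_valid(chain))

# A node in the middle tries to swap in its own key for an old record.
chain[0]["record"]["public_key"] = "key-of-attacker"
print(chain_is_valid(chain))
```

Because every block includes the hash of the one before it, changing an old record invalidates everything after it. The price is that everyone who wants to verify needs a copy of the whole chain.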

But this is likely to fail as Bitcoin has also shown. Some people have lost their bitcoins because their disks crashed and they forgot to make a copy of their coins. Also, the file size of the Bitcoin blockchain has grown to 100 GB of data in 2017 which is quite huge. The whole DNS system is much bigger than Bitcoin is so it would quickly have various problems. Your browser would need to synchronize its data with the network, which would take longer and longer as the amount of data grows every day.

So, no. That’s not really a good option. Even though a decentralized system sounds good.

Conclusion?

So, SSL has flaws. Then again, every security system will have flaws. The main flaw in SSL is that you need to trust others. And that has already proven to be a problem. You have to be ultra-paranoid to want to avoid all risks and a few people are this paranoid.

Richard Stallman, for example, is a great expert on software yet he doesn’t use a mobile phone and avoids using the Internet directly. Mobile phones are “portable surveillance and tracking devices” so he won’t use them. He also avoids key cards and other items that allow people to track wherever he goes. And he generally doesn’t access the Web directly, as this too would allow people to track what he’s doing. (He does use Tor, though.) And maybe he’s on to something. Maybe we are putting ourselves in danger with all this online stuff and various devices that we have on our wrists, in our pockets and at our homes.

Thing is that there is no alternative for SSL at this moment so being paranoid is useful to protect yourself. Everyone should be aware of the risks they take when they visit the Internet. This also depends on how important or wealthy you are, as poor, boring people are generally not interesting to malicious people. There isn’t much to gain from people with no money.

Still, people are too careless when they’re online. And SSL isn’t as secure as most people think, as events from the past have already proven…

How Yahoo has failed.

As many people have already read, Yahoo had a severe data leak in the past which resulted in ALL YAHOO ACCOUNTS being leaked to hackers. The stolen data includes sensitive personal information and an MD5 hash of the password you’ve used with Yahoo. This is a very serious issue, as Yahoo has told me today in an email. It says:

Yahoo
UPDATED NOTICE OF DATA BREACH
Dear Yahoo User,
We are writing to update you about a data security issue Yahoo previously announced in December 2016. Yahoo already took certain actions in 2016, described below, to help secure your account in connection with this issue.
What Happened?

On December 14, 2016, Yahoo announced that, based on its analysis of data files provided by law enforcement, the company believed that an unauthorized party stole data associated with certain user accounts in August 2013. Yahoo notified the users it had identified at that time as potentially affected. We recently obtained additional information and, after analyzing it with the assistance of outside forensic experts, we have determined that your user account information also was likely affected.
What Information Was Involved?

The stolen user account information may have included names, email addresses, telephone numbers, dates of birth, hashed passwords (using MD5) and, in some cases, encrypted or unencrypted security questions and answers. Not all of these data elements may have been present for your account. The investigation indicates that the information that was stolen did not include passwords in clear text, payment card data, or bank account information. Payment card data and bank account information are not stored in the system we believe was affected.
What We Are Doing

In connection with the December 2016 announcement, Yahoo took action to protect users (including you) beyond those identified at that time as potentially affected. Specifically:

  • Yahoo required potentially affected users to change their passwords.

  • Yahoo also required all other users who had not changed their passwords since the time of the theft to do so.

  • Yahoo invalidated unencrypted security questions and answers so they cannot be used to access an account.

We are closely coordinating with law enforcement on this matter, and continue to enhance our systems that detect and prevent unauthorized access to user accounts.

What You Can Do

While Yahoo already has taken action to help secure your account, we encourage you to consider the following account security recommendations:

  • Change your passwords and security questions and answers for any other accounts on which you used the same or similar information used for your Yahoo account.

  • Review your accounts for suspicious activity.

  • Be cautious of any unsolicited communications that ask for your personal information or refer you to a web page asking for personal information.

  • Avoid clicking on links or downloading attachments from suspicious emails.

Additionally, please consider using Yahoo Account Key, a simple authentication tool that eliminates the need to use a password on Yahoo altogether.
For More Information

For more information about this issue and our security resources, please visit the Yahoo 2013 Account Security Update FAQs page available at https://yahoo.com/security-update.

We value the trust our users place in us, and the security of our users remains a top priority.

Sincerely,
Chris Nims
Chief Information Security Officer

And yes, that’s bad… It’s even worse as the hack occurred in 2013 and it has taken Yahoo 4 years to confess everything about the hack. Well, everything? I’m still not sure if we’ve heard everything about this case. Worse, as Verizon recently took over Yahoo for a large sum of money, it could even have an impact on anyone using the Verizon services.
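The notice mentions that the passwords were hashed with MD5. Here is a minimal sketch of why that is little comfort: MD5 is so fast that a leaked hash can be compared against a list of common passwords in no time. The tiny wordlist and the assumption that the hashes were unsalted are mine, not something the notice states. For contrast, the sketch also shows a salted, deliberately slow hash using PBKDF2 from Python’s standard library.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000   # assumed work factor; higher is slower for attackers too


def md5_hex(password: str) -> str:
    """Fast, unsalted hash: the same password always gives the same value."""
    return hashlib.md5(password.encode()).hexdigest()


def dictionary_attack(leaked_hash: str, wordlist):
    """Hash each candidate and compare: this is all a 'cracker' has to do."""
    for candidate in wordlist:
        if md5_hex(candidate) == leaked_hash:
            return candidate
    return None


def slow_salted_hash(password: str, salt: bytes = None):
    """Salted PBKDF2: a random salt plus many iterations per guess."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    return hmac.compare_digest(slow_salted_hash(password, salt)[1], digest)


common_passwords = ["123456", "password", "qwerty", "letmein"]  # toy wordlist
leaked = md5_hex("letmein")
print(dictionary_attack(leaked, common_passwords))

salt, digest = slow_salted_hash("letmein")
print(verify("letmein", salt, digest), verify("password", salt, digest))
```

With a salt, two users with the same password get different hashes, and the iteration count makes every single guess expensive. Neither helps if the password itself is on everyone’s wordlist, which is why changing it everywhere you reused it still matters.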

But there is more, as people might not realise that the sites Tumblr and Flickr are also part of Yahoo. We know that Yahoo was hacked but how about those other two sites? As I said, we might still not know everything…

About to drown by failing security.

Well, assume the worst. While we might be arming ourselves properly against any of these kinds of hacks, we also chain ourselves to the security provided by companies like Yahoo. And those security measures might not protect us against everything.

Fact is that Yahoo has become big and is becoming even bigger now that it is part of Verizon. As a result, all those 3 billion accounts are now owned by Verizon and we had better hope that Verizon will use better security than Yahoo ever did. If not, anyone who ever used Yahoo, Tumblr, Flickr or Verizon might soon drown in security problems, as their accounts have been hacked and hackers will continue to target them.

Is there a solution to this problem? That’s a good question as there are many other companies that we rely upon for our security. Twitter, Google and Facebook are a few popular sites that are also popular targets for hackers. However, as long as these large corporations immediately notify all users if there’s a serious data breach and immediately respond by increasing security, the risks should be acceptable. What Yahoo did was wrong as it took 4 years before they finally admitted the truth!

So in my opinion, Yahoo has to disappear. It is unacceptable that any company with such a major role on the Internet regarding security is trying to hide the truth and keep people vulnerable instead of responding immediately. So instead of following Yahoo’s advice and changing your password, I suggest everyone just close their Yahoo account. Permanently! You might still keep your Flickr and Tumblr accounts as those might not be involved in this hack but Yahoo should go.

And let’s hope that someone will improve the security on both Tumblr and Flickr as these services are highly popular all over the World.

The riddle of the Holy Men…

I generally stay far away from religion but I do like discussions about religion. And recently, I got involved in a discussion with some fanatic who “knew” he was right and I was wrong, as his religious counselors had told him so. So I came up with a simple riddle which he could not answer.

It starts with three Holy Men and a large chest. And as I’m talking about religion in general, I make them of three different religions so we have a rabbi, a priest and an imam. But if you’re talking to a fanatic, make them all Holy Men of his specific religion.

The rabbi starts and opens the chest to look inside. Once he has seen what is inside, he closes the chest and tells you there’s a statue of a golden calf inside, with precious jewels for its eyes. He describes it in detail and he sounds very convincing.

Next, the priest opens the chest and looks inside. After he has seen the content he closes it and tells you what is inside. It is a wooden cross with long, silver nails and a golden hammer. And he too describes it in a lot of detail and sounds very convincing.

Last, the imam opens the chest and looks inside. He has seen the content and closes the chest and tells you what is inside. It is a marble statue of a horse with a crescent moon next to it made from pure silver. And he too describes it in detail and sounds very convincing.

Now, who do you believe? How do you know who is telling the truth?

It’s actually a simple riddle. But for some people, it is still too complex. When they’re from a Christian background they are more likely to believe the priest. Jewish people would favor the rabbi. And Muslims will believe the imam. But all we know is that at least two of these Holy Men are lying. And maybe all three of them are lying.

When you’re not biased by any religion, it becomes more challenging and you would try to find out how reliable these Holy Men are. You talk with them and look at their actions. Have they committed crimes in the past? Have they been caught lying in the past? And what are they describing exactly? And how about their family? Are they from families that have used lies and crimes in the past?

It takes some time to evaluate who is speaking the truth and as all three of them are very convincing you could just decide to believe all three. However, you know that two of them must have been lying, as not all three objects could have been in the chest.

Or could they? Could it be a magical chest which shows the content that the viewer wants to see? If so, all three of them would have spoken the truth but it requires the chest to be very special. And it makes the content of the chest questionable in value, as what worth does an illusion have?

Or you just don’t care about what’s in the chest. That’s actually the easiest solution. However, all three Holy Men tell you that you have to believe one of them as your eternal soul would be in danger if you don’t. All three threaten that you will go to Hell if you don’t believe him or if you believe either of the other two. And that makes it even more challenging as you don’t want to spend your eternal life in Hell.

But if you don’t believe in a soul and eternal life then not even that matters. So the Holy Men are telling you to believe them or else they will end your life. You have to believe one or choose death. And if you pick one, the other two will want to kill you, but the Holy Man whom you believe will try to protect you. With the threat of violence and the danger to your eternal soul, you will be in big trouble and will be forced to make a decision.

So, you wouldn’t make a choice based on whom you believe, but you choose the one who is most beneficial for you and your family. You have no choice as you need protection and each Holy Man can provide protection. But when you choose one, the other two will become your enemies.

So you have a big dilemma. You are forced to choose one and will have to believe what he said about what’s in the chest. And this is the same problem as you see with religion. You have basically thousands of “holy men” claiming they, and only they, know the Truth and that you have to believe them, or else. And instead of having just two enemies you would end up with potentially thousands of enemies if you choose to believe one of them. They will also be enemies if you don’t believe any of them. However, when you follow just one then he will likely protect you against the others.

Or not, as many holy men will actually convince those who believe in them that the others are lying and thus deserve to die. They need to die because they don’t believe in what he is claiming. He knows the truth as he has looked into the chest. And you believe him because you think he’s trustworthy. Right?

Generations later, your children will continue to follow the children of this Holy Man, as you’ve taught them to believe what he told you, so they believe what you told them. However, the Holy Man now has three children. And each child tells the story about the content of the chest differently again. One repeats what his grandfather told. The second one adds a collection of gems and golden coins to the content. The third mentions flasks of wine as part of the content of the chest. And your children will have to make the same choice that you had to make. They have to solve the Riddle of the Holy Men.

And this explains why we now have so many different forms of religion. Why we have so many different opinions and don’t know what’s in the chest, as we don’t know whom we really can trust. This is a riddle that will repeat itself every generation, as new Holy Men will be born and each of them will claim something different about the content of the chest. All these problems simply because we have to choose whom to believe.

Or do we? The solution of the riddle is actually quite simple. You yourself go to the chest and you yourself will open the chest and look inside. Then you will see what is inside. And when you want others to know what is inside, you will invite them to also take a look inside and keep the chest open. That way, every person can learn what is inside the chest without relying on what others claim they saw.

And that’s what science is! In science, you don’t present any beliefs but you show everyone the clear facts and allow them to evaluate the reality themselves by telling them what to do to see those facts. And you allow people to think for themselves and determine for themselves what they are seeing. You suggest theories and others might believe those theories or not. If they want, they can make up their own theories and that’s just fine. Everyone can see inside the chest and see the content.

So the answer of finding the truth is simple. You just have to look for yourself!

New ABN-AMRO phishing email!

(A warning about a phishing email targeting ABN-AMRO customers. As it targets Dutch people, it was originally written in Dutch.)

Today I once again received a spam message in my spam box that tries to get people to click on a link. I immediately marked it as “Phishing”, but it is a bit hard to understand that people still sometimes fall for this, because if you pay attention you can see that nothing about it adds up!

First of all, the email arrived on an account that I don’t use for this bank, although I do have an account there. This shows once again how practical it is to have your own domain name with a catch-all mailbox, so you can create an unlimited number of email addresses.

Other warning signs are the spaces in the date, the salutation “Trouwe Cliënt” (“Loyal Client”) and several other language and style errors in the text. For example, “betaal kaart” sounds quite odd when a “betaalpas” (debit card) is meant. Clearly a case of Google Translate.

The story behind it is strange too, because the bank has problems in its IT systems and therefore the customer suddenly has to take action? And as long as that doesn’t happen, the account is blocked?

It gets more interesting when you take a closer look at the source of the email. The sender uses a subdomain of sodelor.eu and possibly this entire domain is a phishing site. In any case, the subdomain has a phishing page that imitates PayPal. You would then expect PayPal as the sender, but fine. Some people are idiots…

The email also contains a URL that points to a Russian website and that doesn’t surprise me. Russian domain names are often abused by hackers because they are often easy to hack.

If you look further into the source you see that these messages are sent through the German kundenserver.de. This domain has meanwhile been put on various blacklists because of the large amount of spam that is sent through it.

But anyway, the clearest sign that this is spam is that it sits in my spam folder.

Delicious spam!

Once more, a post about spam. Why? Because I have one more interesting email in my spam-box, sent by someone who clearly is confused by the whole topic. So, here’s the email, with some annotations:

[Image: the spam email, with annotations]

Why is it spam? Because Google Apps/GMail says it is. And Google is often right in these things. And as I don’t know Adam Collier, nor see any name of his company, it clearly seems like spam to me too, from some wannabe web developer in India looking for customers without understanding the rules.

Why from India? Well, the English writing is more British than American. The writing style is similar to how Indian spam is generally written, with only single-line paragraphs. The skill set used is also very common among Indian developers. The extreme politeness in the writing is also similar to what you see in most Asian countries, as people there are generally more polite. Then of course, it mentions India in the email too so that wasn’t difficult.

First of all, this email was sent from a genuine, free email address like those offered by Outlook, Gmail and Yahoo. I’m not going to say if it’s Outlook or not as I allow this guy some anonymity, even though his name is probably fake and the address already closed for sending spam. But for me that’s the first sign of spam. If it is sent from a free mail provider then you should make sure you know the sender before continuing! As usual, check the sender first for every email you receive!

Next is the address to where it was sent. While it seems to be my “info” account, it just isn’t! It was received by the account I used for my registrar and used in my domain registration where it is visible in the WhoIs information, including my name and some other details. The “info” address happens to be the address of some other website, which has also received this email. My address was actually part of the BCC header so other recipients would not see that I had received it. Smart, but it is to be expected from mass mailers as they would really piss off a lot of people if they only used the TO or CC fields, as many people tend to ‘Reply to all’ on spam messages, making even more spam.

So they got my address from the WhoIs database. So they should have known my name too! They just can’t use it as this is a mass email that’s probably sent to hundreds or even more people. As this spammer doesn’t seem to use any mass mailer application, I suspect that he just collected a lot of email addresses from interesting-looking domains and just mailed to them all from Outlook so the number of recipients is likely to be hundreds, maybe thousands. Not the millions that more experienced spammers will use.

Interesting is how he’s called a webmanager in his email address while calling himself an online marketing manager in the email. No name for his business so maybe he doesn’t even have a real business. This could be a simple PHP developer who is trying to start a freelance web development business and is hoping to get some customers so he can expand his business. He might have a few friends who are also doing development and is likely a Computer Science student in India who wants to put his lessons to the test. This doesn’t look like a hardcore spammer, even though he is spamming. He’s more a lightweight spammer.

The prices he mentions are very reasonable. Then again, he basically uses standard frameworks like WordPress, Joomla, Magento and Drupal to build those sites which is generally not too much work. I call these “Do not expect too much from us” prices.

There is one major alert in all this, though. The grey line mentions a “Payment Gateway” which you should immediately distrust! Why? Because this developer is probably setting up this payment gateway and might have control over it later on. He could be siphoning off some of the payments made through it or even at one point empty all the money collected and put it in his own bank account! Good luck getting your money back!

Well, he could be honest but you should not take that risk to begin with…

It is interesting to see that he also provides Android and iOS applications. He seems to be specialized in PHP so he would need to know Swift or Objective-C to do the iOS development and Java for the Android development. Or have some other programming environment that allows him to develop for both platforms. He might be using Visual Studio with Xamarin which would allow him to target different platforms. Or he has friends who specialized in app development.

At the bottom of his email he tells you that this isn’t spam and that he actually hates spam. So if you aren’t interested you should just reply to him so he can confirm that your email address exists and is in use so he won’t be sending emails to it. Wait… Why does he need that? People who aren’t interested generally won’t respond! So he might actually be collecting confirmations for other purposes…

Anyways, it shows that many spammers are generally amateurs, not knowing what they’re doing. Some might work for some business and think they can promote it this way while others are just freelance developers trying to find work in the current market. Both will generally learn that these kinds of emails are spam and generally end up being blacklisted or lose their free email account. The problem is not that they really want to spam people, but they are misguided in thinking that you can just send emails to everyone as part of their marketing strategy!

Unfortunately, it doesn’t work that way! If you send these kinds of messages unsolicited then you are spamming. If you seek new customers then you should start by registering your own domain name and provide proper information about yourself. Use your own domain name for sending emails and not some free provider and more important: use mailer software where people can subscribe and unsubscribe and only mail people who have subscribed! Also provide a simple web-based solution to unsubscribe as a link in your email. People might still consider it spam but at least the risk of being blacklisted becomes lower as you’re conforming to the anti-spamming rules.

If you want to do proper business online then you need to be familiar with the rules. You should know about spam and how to avoid becoming a spammer. You should have a clear profile of your business online, preferably under your own domain name. And you need to know about the legislation of the countries that you’re targeting like the cookie-laws and privacy laws in Europe. Thing is, if your site and services are targeting foreign nations then you are operating under their laws also! Never forget that!

And with that, this lesson ends…

Donald Trump is NOT my President!

That’s because I’m Dutch and still rolling on the floor over this past election result from the USA, showing the utter madness that a Democracy can be, sometimes. Then again, the people in the USA didn’t have much to choose from, did they?

Many people were actually surprised that Trump got elected, which is strange as the Democratic Party and the Republican Party are about equally popular. The USA is basically a binary system, as there aren’t many alternative choices. Well, not voting is a choice, albeit a very bad one as you won’t get anything to say afterwards. But judging by the huge amounts of protests from the US population and all the bad things being said about Trump, it still is a bit of a surprise.

But again, binary system! The population of the USA is roughly divided into just two groups and many of these voters are unlikely to switch sides. There are, of course, many swing voters who can go either way but in general, both parties have their own loyal supporters. With no alternative choice than basically two bad candidates, many voters didn’t have any reason to switch sides, although Hillary Clinton did have a past history as First Lady and some Democrats might have wanted to keep her husband out of the White House.

So, today, Donald Trump will be crowned as the New President of the USA. So, utter chaos next?

No, I don’t think so. The Republican Party has been in power many times before and even had some very good Presidents in the past. Lincoln was a Republican. So were Eisenhower and Reagan. Reagan was an important factor in the end of the Cold War, even! Sure, he used to be an actor but in the 1984 elections, he pulverized his Democratic opponent Walter Mondale.

But this election was very close again. Clinton won the Popular Vote but in the USA that isn’t important. Trump just won the most electoral votes.

Then again, many people are also ignoring the fact that Faith Spotted Eagle also won one electoral vote during this election, thus becoming the first Native American to receive one such vote. Well, thanks to a faithless elector in Washington who was supposed to vote for Clinton…

Several other faithless electors decided not to vote for Clinton, which clearly tells me that she wasn’t a favorite among her own party. And that also explains why Trump got elected. Hillary Clinton just wasn’t the best candidate to pick for the Democrats.

First of all, Hillary Clinton would have been the first female POTUS if she had been elected. But as the USA business world is still strongly male-oriented, I think her gender was already costing her some votes. She is also known as a former FLOTUS from when her husband Bill was in office. And Bill Clinton had smoked a bad cigar while in office so it is understandable that some people would not like to see him return to the White House. Then there were some scandals about emails and an embassy in Libya and something involving the sexual abuse of minors and more fake news that put her in a bad spotlight, but that also happened to Donald Trump. Fact is, this election saw so much fake news that most people stopped believing all of it. They just relied on things they knew from the past.

I think Bernie Sanders would have been a much better choice anyway. Still, the Democrats chose Clinton and they’re allowed to make such mistakes. Considering that Hillary won the Popular Vote, it still wasn’t that bad.

So, what will happen next? Trump is a businessman and as such, he’s known to try and make companies more profitable. Sure, some of his companies went broke but overall, he has been reasonably successful with his businesses. Maybe with a bit of fraudulent support but still, he has collected quite a bit of wealth in his lifetime. Lost a lot too, though.

First of all, his tax plans don’t seem to be that bad. It seems that low-income Americans won’t even have to pay any taxes. And as he wants to reduce the number of brackets from 7 to 3, income tax should become a bit simpler for everyone.

He does want to repeal the Obamacare tax, though. Also called the Affordable Care Act, Obamacare is considered extremely valuable for many Americans as it means that each and every one of them will at least have some minimum health care so people can get sick without going broke. But repealing the tax doesn’t repeal the act itself. In fact, Trump might find some other solution to fund this care, maybe change a thing or two, have it renamed to Trumpcare and POOF! Obamacare is gone! Yet the system would have barely changed and all Americans would still have basic health care.

Immigration is also a hot topic and Trump wants to build a wall to keep Mexicans out of the USA. And probably have Mexican companies doing most of the building of this wall as the Mexican government might want to subsidize the extra employment for their own population. A more strict policy on illegal immigrants might be useful as it could actually increase the wages of the legal immigrants! So, we’ll just have to wait and see what will happen…

Of course, Donald Trump has his site full of interesting plans that all sound quite nice. He’s a Master Merchant at this, knowing how to sell his ideas to the public, as he has done all his life with all his businesses. Trump is a Master at Selling and we can only hope that he will keep his promises. But in the end, it isn’t Trump who is in Power. It is the Republican Party that is in Power and Trump is just their main spokesperson. He can’t do much that his party won’t approve, as they would just vote against him and probably even force him to step down, if need be. Barack Obama had similar problems as he had Great Plans, but could not get all of them executed as his own party and the Republican Party shot most plans down again. Donald Trump will have the same problem as the Democratic Party will try to resist any of his plans. So he needs the support of all the Republicans, else they just tip the balance against him.

The USA is just a binary system which makes it difficult to rule and have a lot of changes. Not many Presidents have managed to make a lot of changes to the whole system. Trump does have the advantage that the Senate is mostly Republican also so for a while, he will be able to execute some of his better plans. But in 2018 we will see the next Senate Election and as the Democrats won 3 seats in the past election, chances are that the Democrats will soon be in control over the Senate and thus stop whatever Trump has planned.

So Trump will have less than 2 years to prove his success as the next POTUS. And while many people hope he will fail, I realise that Trump failing as President will just bring more Chaos in the USA. So I’m expecting a few errors from Trump and a few successes and in 4 years, there will be new elections and by then, the Democrats will likely pick a better candidate.

Would be nice if Faith Spotted Eagle became the next POTUS, though. A Native American Female President! How cool would that be?

The Binary Search problem

Many developers will have to learn all kinds of algorithms in their lives so they can write highly optimized code. Many of these algorithms have long histories and are well-tested. And one of them is the binary search method.

The binary search is a fast algorithm to find a record in a sorted list of records. For most people, this is a very familiar algorithm if they ever had to guess a value between 1 and 10, or 1 and 100. The principle is quite simple. You have x records and you pick the record in the middle of the list. For guessing between 1 and 100, you would pick 50. (100/2) If it is the correct value, you’ve won. If it is too high, you now know the value must be between 1 and 49 so you guess again with value 25. Too low and you pick the value in the middle between 51 and 100, which would be 75. You should be able to guess the value in at most 7 tries for values between 1 and 100.
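To make the guessing game concrete, here is a minimal sketch in Python. The function name guess_number and the way it counts tries are my own choices for illustration; it simply assumes the other player answers "correct", "too high" or "too low" after every guess.

```python
def guess_number(secret, low=1, high=100):
    """Return how many guesses are needed to find `secret` in low..high."""
    tries = 0
    while low <= high:
        tries += 1
        guess = (low + high) // 2   # pick the middle of the remaining range
        if guess == secret:
            return tries
        if guess > secret:
            high = guess - 1        # too high: drop the upper half
        else:
            low = guess + 1         # too low: drop the lower half
    raise ValueError("secret is outside the range")


# Try every possible secret and remember the worst case.
worst_case = max(guess_number(n) for n in range(1, 101))
print(worst_case)
```

With this particular strategy the first guess is 50 and no secret between 1 and 100 needs more than 7 guesses.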

Actually, the binary search is easiest explained as bit-wise checking of values. A single byte will go from 00000000 to 11111111 so basically all you do is a bitwise compare from the highest bit to the lowest. You start with value 10000000 (128) and if the value you search for is equal or higher, you know that first bit is 1, else it needs to be 0.

Your second guess would either be 11000000 (192) or 01000000 (64) and you would continue testing bits until you’ve had all bits tested. However, your last test could also indicate that you guessed wrong so the maximum number of guesses would be equal to the number of bits plus one.
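The bit-by-bit guessing described above could be sketched like this. It is only an illustration: find_byte is a hypothetical name, and I assume the only question you can ask is "is the secret at least this big?", followed by one final check to see if the result really matches.

```python
def find_byte(secret):
    """Recover an 8-bit value by testing its bits from high to low."""
    value = 0
    comparisons = 0
    for bit in range(7, -1, -1):          # bit 7 (128) down to bit 0 (1)
        candidate = value | (1 << bit)    # tentatively switch this bit on
        comparisons += 1
        if secret >= candidate:           # the bit belongs in the answer
            value = candidate
    comparisons += 1                      # last test: did we really find it?
    return value, comparisons, value == secret


print(find_byte(200))
```

For a byte that is 8 bit tests plus the final check, so 9 comparisons in total, which matches the "number of bits plus one" rule.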

And that’s basically what a binary search is. But it tends to be slightly more complicated. You’re not comparing numbers from 0 to some maximum value; instead, those numbers are generally an index into an array, and you compare the value at that position in the array. You have a value X which could be any data type, even a multi-field record, and you have an array of records which is sorted on some specific key. And these arrays can be reasonably large. Still, the binary search will allow you to search them very quickly.
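As a rough sketch of searching an array of records instead of plain numbers, the Python below looks up a record by its key. The names find_record, key_of and the sample customers list are made up for this example, and the list is assumed to be sorted on that key already.

```python
def find_record(records, key, key_of=lambda record: record[0]):
    """Return the index of the record with the given key, or -1."""
    floor, ceiling = 0, len(records) - 1
    while floor <= ceiling:
        middle = (floor + ceiling) // 2       # index to compare against
        current = key_of(records[middle])     # key stored at that position
        if current == key:
            return middle
        if current < key:
            floor = middle + 1                # continue in the upper half
        else:
            ceiling = middle - 1              # continue in the lower half
    return -1


# Hypothetical multi-field records, sorted by their first field.
customers = [(3, "Alice"), (8, "Bob"), (15, "Carol"), (42, "Dave")]
print(find_record(customers, 15))
print(find_record(customers, 7))
```

Python integers never overflow, so the simple middle calculation is harmless here. That is different in languages with fixed-size integers.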

Now, the biggest problem with the binary search is how people will calculate the index value for the comparison. I already said that you could basically check the bits from high to low but most developers will use a formula like (floor+ceiling)/2 where floor would be the lowest index value and ceiling the highest index value. This can cause an interesting problem with several programming languages because there’s a risk of overflows when you do it like this!

So, overflow? Yes! If the index is an unsigned byte then it can only hold a value of 11111111 (255) as a maximum value. So as soon as you have a floor value of 10000000 (128) and a ceiling of at least 10000001 (129) then the sum would require 9 bits. But bytes can’t contain 9 bits so an overflow occurs. And what happens next is difficult to predict.

For a signed byte it would be worse, since value 10000000 would be -128 so you would effectively have 7 bits to use. If the 8th bit is set, your index value becomes negative! This means that with a signed byte, your array could never be longer than 64 records, else this math will generate an overflow. (64+65 would be 129, which translates to -127 for signed bytes.)
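Python integers don’t overflow by themselves, so to show the effect I simulate a signed byte with a small helper. Both wrap_int8 and naive_middle_int8 are hypothetical names, and the sketch assumes two’s-complement wrap-around, which is what many platforms do in practice but which is not guaranteed by every language.

```python
def wrap_int8(value):
    """Wrap an integer the way a two's-complement signed byte would."""
    return (value + 128) % 256 - 128


def naive_middle_int8(floor, ceiling):
    """(floor+ceiling)/2 where the sum has to fit in a signed byte."""
    total = wrap_int8(floor + ceiling)   # the sum is squeezed into 8 bits
    return total // 2                    # note: Python rounds down here


print(wrap_int8(64 + 65))            # the sum 129 no longer fits
print(naive_middle_int8(10, 20))     # small indices still work fine
print(naive_middle_int8(64, 65))     # the "middle" has become negative
```

The exact value you get after the division depends on how a language rounds negative numbers, but either way the index is negative and useless.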

Fortunately, most developers use integers as index, not bytes. They generally have arrays larger than 256 records anyways. So that reduces the risk of overflows. Still, integers use one bit for the sign and the other bits for the number. A 16-bit integer thus has 15 bits for the value. So an overflow can happen if the number of records has the highest bit value set, meaning any value of 16384 and over. If your array has more than 16384 records then the calculation (floor+ceiling)/2 will sometimes generate an overflow.

So, people solved this by changing the formula to floor+((ceiling-floor)/2) because ceiling-floor cannot cause an overflow. It does make the math slightly more complex but this is the formula that most people are familiar with when doing a binary search!

Yet this formula makes no sense if you want a high performance! If you want a binary search, you should actually just toggle each bit for the index until you found the value. To do so, you need to know how many bits you need for the highest value. And that will also tell you how many guesses you will need, at most, to find the value. But this kind of bitwise math tends to be too complex for most people.
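What toggling bits for the index could look like is sketched below. This is my own illustration of the idea, not a standard library routine: bit_toggle_search is a hypothetical name, and it assumes a sorted list in which it builds the index one bit at a time, from the highest bit down.

```python
def bit_toggle_search(items, target):
    """Find `target` in a sorted list by building the index bit by bit."""
    count = len(items)
    if count == 0:
        return -1
    bits = (count - 1).bit_length()       # bits needed for the highest index
    index = 0
    for shift in range(bits - 1, -1, -1):
        candidate = index | (1 << shift)  # try switching this bit on
        if candidate < count and items[candidate] <= target:
            index = candidate             # keep the bit
    return index if items[index] == target else -1


values = [2, 3, 5, 7, 11, 13, 17]
print(bit_toggle_search(values, 11))
print(bit_toggle_search(values, 4))
```

There is no addition of two indices anywhere, so nothing can overflow, and the number of loop rounds is exactly the number of bits in the highest index.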

So, there is another solution. You can promote the index value to a bigger one. You could use a 32-bit value if the index is a 16-bit value. Thus, you could use (int16)(((int32)floor+(int32)ceiling)/2) and the overflow is gone again. And for a 32-bit index you could promote the math to a 64-bit integer type and again avoid any overflows.
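To compare the formulas side by side, here is a sketch that simulates 16-bit and 32-bit signed integers in Python. The helper wrap and the three middle_* functions are names I made up, and the wrap-around behaviour is again an assumption about the platform.

```python
def wrap(value, bits):
    """Wrap an integer into a two's-complement signed type of `bits` bits."""
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half


def middle_naive(floor, ceiling):
    # (floor+ceiling)/2 with everything done in 16 bits
    return wrap(floor + ceiling, 16) // 2


def middle_subtract(floor, ceiling):
    # floor+((ceiling-floor)/2), still done in 16 bits
    return wrap(floor + wrap(ceiling - floor, 16) // 2, 16)


def middle_promoted(floor, ceiling):
    # add as 32-bit values, then cast the result back to 16 bits
    total = wrap(floor + ceiling, 32)
    return wrap(total // 2, 16)


print(middle_naive(20000, 30000))
print(middle_subtract(20000, 30000))
print(middle_promoted(20000, 30000))
```

With a floor of 20000 and a ceiling of 30000 the naive version wraps around to a negative number, while the other two both give 25000.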

It is still less optimal than just toggling bits but the math still looks easy and you can explain why you’re promoting the values.

But what if the index is a 64-bit value? There are almost no 128-bit values in most programming languages. So how to avoid overflows in those languages?

Well, here’s another thing. As I said, the index value is part of an array. And this array is sorted and should not have any duplicate values. So if you have 200 records, you would also need 200 unique values, with each value being at least 1 byte in size. If the index is a 16-bit signed integer with 15 bits for the value, then the values in the array must also be at least 15 bits and would generally be longer. Most likely, it would contain pointers to records elsewhere in memory and pointers are generally 32 bits. (In the old MS-DOS era, pointers were 20 bits, so these systems could manage up to 1.048.576 bytes or 1 megabyte of memory.)

So, let’s do math! For an overflow to occur with an index as a signed 16-bit integer you would need to have at least 16384 records. Each record would then be at least 2 bytes in size, thus you would have at least 32 kilobytes of data to search through. Most likely even more, since the array is probably made up of pointers pointing to string values or whatever. But 32 KB would be the minimum for an overflow to occur when using a 16-bit signed index.

So, a signed 32-bit index would at least have bit 30 set to 1 before an overflow can occur. It would also need to contain 32-bit values to make sure every value is unique so you would have 4 GB of data to search through. And yes, that is the minimum amount of data required before an overflow would occur. You would also need up to 31 comparisons to find the value you’re searching for, which is becoming a bit high already.

So, a signed 64-bit index would have records of at least 8 bytes in size! This requires 36.893.488.147.419.103.232 bytes of data! That’s 33.554.432 terabytes! 32.768 petabytes! 32 exabytes! That’s a huge amount of data, twice the amount of data stored by Google! And you need more data than this to get an overflow. And basically, this is assuming that you’re just storing 64-bit integer values in the array but in general, the data stored will be more complex.
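The arithmetic above is easy to check with a few lines of Python. The function name minimum_bytes_for_overflow is just for illustration, and it uses the same assumptions as the text: the second-highest bit of the index must be set, and every record is at least as wide as the index itself.

```python
def minimum_bytes_for_overflow(index_bits):
    """Smallest amount of data before floor+ceiling can overflow."""
    records = 1 << (index_bits - 2)    # second-highest bit of the index set
    record_size = index_bits // 8      # each value at least as wide as the index
    return records * record_size


for bits in (16, 32, 64):
    print(bits, minimum_bytes_for_overflow(bits))

# The 64-bit result expressed in binary units of 1024.
exabytes = minimum_bytes_for_overflow(64) // 1024 ** 6
print(exabytes)
```

That gives 32 KB for a 16-bit index, 4 GB for a 32-bit index and 32 exabytes for a 64-bit index.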

So, chances of overflows with a 32-bit index are rare and on 64-bit indices it would be very unlikely. The amount of data required would be huge. And once you’re dealing with this much data, you will have to consider alternate solutions instead.

The alternate solution would be hash tables. By using a hash function you could reduce any value to e.g. a 16-bit value. This would be the index of an array of pointers with a 16-bit index so it would be 256 KB for the whole array. And each record in this array could be pointing to a second, sorted array of records so you would have 65536 different sorted arrays and in each of them you could use a binary search for data. This would be ideal for huge amounts of data, although things can be optimized better to calculate to an even bigger hash-value. (E.g. 20 bits.)

The use of a hash table is quite easy. You calculate the hash over the value you’re searching for and then check the list at that specific address in your hash table. If it is empty then your value isn’t in the system. Otherwise, you have to search the list at that specific location. Especially if the hash formula is evenly distributing all possible values then a hash table will be extremely effective.
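A minimal sketch of such a hash table with sorted lists behind it could look like the Python below. The class BucketedIndex and the function bucket_of are hypothetical names, and using CRC-32 cut down to 16 bits as the hash is only an assumption to keep the example short. The lookup inside each list uses the binary search from Python’s standard bisect module.

```python
import bisect
import zlib

BUCKET_BITS = 16
BUCKET_COUNT = 1 << BUCKET_BITS          # 65536 possible hash values


def bucket_of(key):
    """Reduce a string key to a 16-bit hash value."""
    return zlib.crc32(key.encode("utf-8")) & (BUCKET_COUNT - 1)


class BucketedIndex:
    def __init__(self):
        # One slot per hash value; None means no list exists there yet.
        self.buckets = [None] * BUCKET_COUNT

    def add(self, key):
        slot = bucket_of(key)
        if self.buckets[slot] is None:
            self.buckets[slot] = []
        bucket = self.buckets[slot]
        position = bisect.bisect_left(bucket, key)   # keep the list sorted
        if position == len(bucket) or bucket[position] != key:
            bucket.insert(position, key)

    def contains(self, key):
        bucket = self.buckets[bucket_of(key)]
        if bucket is None:
            return False                 # empty slot: the key isn't stored
        position = bisect.bisect_left(bucket, key)   # binary search in the list
        return position < len(bucket) and bucket[position] == key


index = BucketedIndex()
for word in ("chest", "riddle", "spam", "overflow"):
    index.add(word)
print(index.contains("spam"))
print(index.contains("phishing"))
```

Each sorted list only holds the keys that share one hash value, so the lists stay short and the index used for the binary search inside them stays small.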

Which brings me to a clear point: the binary search isn’t really suitable for large amounts of data! First of all, your data needs to be sorted! And you need to maintain this sort order every time you add items to or delete items from this list, or when you change the key value of a record! Hash tables are generally unsorted and have a better performance, especially with large amounts of data.

So, people who use a 32-bit index for a binary search are just getting themselves into trouble if they fear any overflows. When they start using floor+((ceiling-floor)/2) for their math, they’re clearly showing that they just don’t understand the algorithm that well. The extra math will slow down the algorithm slightly while the risk of overflows should not exist. If it does exist with a 32-bit index then you’re already using the wrong algorithm to search for data. You’re at least maintaining an index of 4 GB in size, making it really difficult to insert new records. That is, if overflows can occur. The time needed to sort that much data is also quite a lot and again, far from optimal.

Thing is, developers often tend to use the wrong algorithms and often have the wrong fears. Whenever you use a specific algorithm you will have to consider all options. Are you using the right algorithm for the problem? Are you at risk of having overflows and underflows? How much data do you expect to handle? And what are the alternative options?

Finally, as I said, doing a binary search basically means toggling bits for the index. Instead of doing math to calculate the half value, you could instead just toggle bits from high to low. That way, you never even have a chance of overflows.