Night Hour

Reading under a cool night sky ... a quiet, contemplative night ...

A Mini Survey of Email Cloud Services in Singapore


Everything should be made as simple as possible, but not simpler. - Albert Einstein


20 Feb 2017


Introduction

Cloud computing has been in the news for some time now, and many companies are already using cloud services or are planning to adopt them. The surveys and forecasts conducted by research organizations such as IDG and Gartner all point to a high rate of growth for cloud computing and increasing cloud adoption. It would be interesting to know how many of the big companies and enterprises in Singapore have adopted some form of cloud services. In this article, we will conduct a mini survey to find out.

General Approach and Method

So how can one go about finding out about cloud adoption among big companies in Singapore? A logical first step is to define the scope: which companies and which type of cloud service. For the companies, we can look at the listed blue chip companies which make up the STI (Straits Times Index) of the Singapore stock market. As for the type of cloud service, we simply look at email, such as Google's application suite or Microsoft Office 365, because mail exchanger (MX) records are public and easy to query. With these initial definitions we can start our survey and find out what these 30 STI companies are using.

We need to obtain a list of the 30 companies and the email domains that they are using. There are various ways to do this. We can write a web crawler script that retrieves the SGX STI constituents listing, gets the company names, looks up the respective stock codes and extracts the company domains. Alternatively, we can do this manually: do a Google search for each company, go to its contact or about page and extract the relevant email domains.

In this case, I have decided to go the manual route for now. Scripting a crawler for this purpose can be left as a future exercise. Given the small number of companies involved, and this being a first attempt, a crawler isn't warranted yet. Nothing beats a human (at least until AI takes over).

Preparing the Company List

We will use a two-column spreadsheet to store the list of companies and domains: a company name column and an email domain column. If a company has more than one domain, commas are used to separate the domains. The spreadsheet can be exported as a CSV text file for processing later. The screenshot below shows what it should look like.

Fig 1. Spreadsheet columns.

When saving to CSV text, I chose to delimit the fields (columns) with a semicolon, leave the text delimiter blank and use the UTF-8 character set. I am using the open source LibreOffice application. A single line of the CSV text file will have the following format:

<Company Name> ; <domain1>,<domain2>,<domain3>...

The CSV text file should look like this. The first line holds the column labels and will be skipped during processing.

Fig 2. Csv file format.

When going through the various companies to find out their email domains, it is necessary to apply some human judgement. For instance, Singtel has an Australian subsidiary, Optus, so I have included the Optus domain, but I didn't include other Singtel partners and associate companies. For Singtel, I have collected 3 email domains: singtel.com, singnet.com.sg and optus.com.au.

In another case, Thai Beverage is the main owner of Fraser and Neave (F&N), so the F&N domain is included. I have done the same for the rest of the companies. Some are pretty straightforward, with the email contacts listed clearly on their sites; others need a little human logic and judgement.

MX Lookup Script

Now that we have the lookup data (the listing of companies and domains) prepared, we create a Python script to query the MX records of these domains. The results can then be analyzed later. I am using Python 3 and a module called dnspython to do this. If you are on Ubuntu, Python 3 is most likely already installed. To install dnspython, just do

sudo apt-get install python3-dnspython

The basic idea is to read in the CSV text file containing the companies/domains listing and create a Python list containing name value pairs (company and domain). We then iterate through this list of name value pairs, querying the MX for each domain and printing out each result as an individual line containing

Company name; Domain; MX record

Here is a listing of the code that reads in the CSV text file and creates the list of name value pairs (company and domain). The name value pairs themselves are Python lists.

#
# Function to read in the csv and format it into a lookup list containing
# other lists of name value pairs, each consisting of a company name and a domain.
# Eg.
#
# [["Company name1","domain1"],["Company name1", "domain2"], ....]
#

def readcsvfile(csv_filename):
    lookuplist = []

    # The with block closes the file even if an error occurs
    with open(csv_filename, "r", encoding="utf-8") as fcsv:
        fcsv.readline()  # Skip the first line (column labels)

        for line in fcsv:
            line = line.strip()
            if ";" not in line:
                continue  # Skip blank or malformed lines
            namedomains = line.split(";")
            name = namedomains[0].strip().lower()
            for domain in namedomains[1].split(","):
                domain = domain.strip().lower()
                if domain:
                    lookuplist.append([name, domain])

    return lookuplist
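
As an alternative, the same parsing could be done with the csv module from the Python standard library. The following is only a minimal sketch, not the script used in this survey: the function name read_lookup_pairs and the column labels in the sample are assumptions made for illustration, and the sample row reuses the Singtel domains mentioned earlier.

```python
import csv
import io


def read_lookup_pairs(text_stream):
    """Return [company, domain] pairs from a semicolon-delimited stream."""
    reader = csv.reader(text_stream, delimiter=";")
    next(reader, None)  # skip the column label line
    pairs = []
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        company = row[0].strip().lower()
        for domain in row[1].split(","):
            domain = domain.strip().lower()
            if domain:
                pairs.append([company, domain])
    return pairs


# Illustrative input held in memory; the column labels are hypothetical.
sample = io.StringIO(
    "Company Name;Domains\n"
    "Singtel;singtel.com, singnet.com.sg, optus.com.au\n"
)
print(read_lookup_pairs(sample))
```

Taking a stream rather than a file name makes the function easy to try out on a few lines of text before pointing it at the exported file.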

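The actual MX query will be done using the dnspython module. The module, along with the standard time module used for pausing between queries, can be imported using the following lines.

import time
import dns.exception
import dns.resolver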

Here is the snippet that does the actual MX query using the list of name value pairs created earlier. The DNS query code can raise an exception if there is no MX record or if the domain does not exist. These exceptions need to be properly handled. Here, "NOMX" is set as the mail exchanger for domains that cause such an exception.

#
# Do a MX lookup for each of the name value pairs in the lookup list
# and output the result in the format
#
# Company Name1;domain1;MX1
# Company Name1;domain1;MX2
# ....
#
# If there is no MX for a domain, "NOMX" will be used in place of the actual MX record.
#
def lookupMX(lookuplist):

    for name, domain in lookuplist:

        answers = None
        try:
            # Note: dnspython 2.x renames query() to resolve()
            answers = dns.resolver.query(domain, 'MX')
        except dns.exception.DNSException:
            # Leave answers as None if there is no MX or the domain does not exist
            pass

        if answers is None:
            # No MX record for domain
            print(name, ";", domain, ";", "NOMX", sep="")
        else:
            for rdata in answers:
                print(name, ";", domain, ";", rdata.exchange, sep="")

        # Sleep for 2 seconds to avoid excessive DNS queries
        time.sleep(2)
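
One possible refinement, sketched here under assumptions rather than taken from the survey script, is to separate the formatting of the output lines from the DNS query itself, so that the "NOMX" handling can be checked without touching the network. The names mx_rows, dnspython_resolve_mx and fake_resolver are hypothetical, and the host names in the demonstration are made up.

```python
def mx_rows(company, domain, resolve_mx):
    """Build the 'company;domain;MX' output lines for one lookup pair.

    resolve_mx is any callable that takes a domain and returns an iterable
    of MX host names, or raises an exception when the lookup fails.
    """
    try:
        exchanges = list(resolve_mx(domain))
    except Exception:
        # Any lookup failure (no MX record, unknown domain, timeout)
        # is treated the same way in this sketch.
        exchanges = []
    if not exchanges:
        return [f"{company};{domain};NOMX"]
    return [f"{company};{domain};{mx}" for mx in exchanges]


def dnspython_resolve_mx(domain):
    """Resolver backed by dnspython, for real lookups."""
    # Imported lazily so mx_rows() can be exercised without dnspython installed.
    import dns.resolver
    # dnspython 2.x renamed query() to resolve(); fall back for older releases.
    resolve = getattr(dns.resolver, "resolve", None) or dns.resolver.query
    return [str(rdata.exchange) for rdata in resolve(domain, "MX")]


def fake_resolver(domain):
    """Stand-in resolver with made-up data, used for the offline demo."""
    if domain == "example.com":
        return ["mx1.example.net.", "mx2.example.net."]
    raise LookupError("no MX")


for line in mx_rows("example co", "example.com", fake_resolver):
    print(line)
print(mx_rows("example co", "example.invalid", fake_resolver)[0])
```

For real queries, dnspython_resolve_mx would be passed in place of fake_resolver, with the same pause between lookups as in the listing above.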

This is what the MX query results will look like when the script is run. These results can be redirected into a file and saved for further processing.

Fig 3. MX Query Results.

Processing the Results

We create another script to process the MX query results before doing the actual analysis. What we really want to know is, for each company, the unique mail exchanger domains that it is using. Is it using outlook.com or google.com? If the MX domain is outlook.com, then the company is most probably using Microsoft Office 365; likewise google.com points to Google's application suite, and similarly for other cloud email vendors.

The processing script creates a Python dictionary with the company name as the key and the corresponding list of mail exchanger domains as the value. The following is the code snippet.

# getMxDomain() is a helper defined elsewhere in the script.

if __name__ == "__main__":

    namemx = {}
    with open("results2.txt", "r") as f:
        for line in f:
            part = line.strip().split(";")
            if len(part) < 3:
                continue  # Skip blank or malformed lines

            # Create an empty list the first time a company name is seen,
            # otherwise reuse the existing list
            mxlist = namemx.setdefault(part[0], [])
            if part[2] != "NOMX":
                mxlist.append(getMxDomain(part[2]))

    for k in namemx:
        namemx[k] = sorted(set(namemx[k]))  # Get the unique mx domains
        for unique in namemx[k]:
            print(k, ";", unique, sep="")
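
The snippet above calls getMxDomain(), which is not listed in this article. The sketch below shows one way such a helper could reduce a mail exchanger host name to its domain. It is an assumption for illustration only: the name mx_domain_sketch, the small suffix list and the sample host names are all hypothetical, and the actual helper in the repository may work differently.

```python
# Hypothetical short list of two-label suffixes; a real script would
# probably need a fuller list.
TWO_LABEL_SUFFIXES = {"com.sg", "net.sg", "com.au", "co.uk"}


def mx_domain_sketch(exchange):
    """Reduce an MX host name such as 'aspmx.l.google.com.' to 'google.com'."""
    labels = exchange.rstrip(".").lower().split(".")
    if len(labels) <= 2:
        return ".".join(labels)
    last_two = ".".join(labels[-2:])
    if last_two in TWO_LABEL_SUFFIXES:
        # Keep one more label for suffixes such as com.sg
        return ".".join(labels[-3:])
    return last_two


# Made-up host names for illustration.
for host in ["aspmx.l.google.com.", "mx1.example.com.sg."]:
    print(host, "->", mx_domain_sketch(host))
```

Keeping only the last two labels is a simplification; without the suffix list, a host under com.sg would be reduced to "com.sg", which says nothing about the vendor.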

Here is what running the processing script looks like. Notice that I have piped the output into the Unix grep command to filter for specific patterns, in this case the Google and Outlook mail exchangers.

Fig 4. Processing Results.

Analysis of Results and Accuracy

Using the method outlined above, we have already obtained and processed a set of results that we can use for analysis. Since we are not contacting the companies directly to ask about their actual cloud usage, there can be some inaccuracies in the analysis. There is a need to make some assumptions and rely on logical deduction.

The following lists some of the assumptions.

  • If the mail exchanger domain is outlook.com, we assume Office 365 (at least Outlook email must be in use).
  • If the mail exchanger domain is google.com or googlemail.com, we assume Google's application suite.
  • If the mail exchanger domain is messagelabs.com, we assume it is the Symantec cloud security service. We interpret this as a cloud service being used, although in reality some of the actual email systems may be on-premise, with the Symantec cloud acting as a layer of protection before mail is routed to them.
  • Any other mail exchanger domains belonging to vendors, such as qq.com, mailcontrol.com and pphosted.com, are not considered.

A few other things to note: a company can be using multiple services, for example both google.com and outlook.com. When working out the aggregate result on whether a cloud email service is used, we need to avoid counting a company more than once.

For example, if company A uses outlook.com, google.com and messagelabs.com, we will just count it once as using a cloud service. On the other hand, as long as one of the email domains under a company uses one of the defined cloud providers, we will count the company as using a cloud service, even if all its other email domains don't.

In this survey, some email domains may have been missed or are not listed publicly. The coverage is limited to the 30 STI companies; the wider market may be quite different. All these, and other factors that we may not have considered or anticipated, can affect the accuracy of the results.
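
The counting rule described above can be sketched as follows. This is an illustration under assumptions, not the analysis actually used for the survey: the function name summarize_cloud_use is hypothetical, and the sample companies are made up and are not survey data.

```python
# MX domains treated as cloud email providers, per the assumptions listed above.
CLOUD_MX_DOMAINS = {"outlook.com", "google.com", "googlemail.com", "messagelabs.com"}


def summarize_cloud_use(namemx):
    """Count each company at most once as a cloud user.

    namemx maps a company name to its list of unique MX domains.
    Returns the sorted list of cloud users and their percentage share.
    """
    cloud_users = sorted(
        company
        for company, mxdomains in namemx.items()
        if CLOUD_MX_DOMAINS.intersection(mxdomains)
    )
    share = 100.0 * len(cloud_users) / len(namemx) if namemx else 0.0
    return cloud_users, share


# Made-up companies for illustration only.
sample = {
    "company a": ["outlook.com", "google.com", "messagelabs.com"],
    "company b": ["pphosted.com"],
    "company c": ["company-c.example", "googlemail.com"],
    "company d": [],
}
users, share = summarize_cloud_use(sample)
print(users, f"{share:.1f}%")
```

Company A is counted once even though it matches three providers, and company C is counted because one of its domains matches, which mirrors the two rules in the text.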

Survey Results

From the analysis, 30% of the STI companies use Microsoft Office 365 (outlook.com), about 27% use the Symantec cloud security service (messagelabs.com) and around 3% use Google's application suite. This result fits the common perception that Google's applications tend to be used by technology companies and startups, while traditional enterprises are generally tied to Microsoft and to vendors like Symantec, which have been serving enterprises for a long time.

Cloud Vendors Adoption Among STI Companies (percentage share of each cloud vendor among STI companies)

  • Outlook.com (Office 365): 30%
  • Messagelabs.com (Symantec Cloud Security): 26.6%
  • Google.com (Google App Suite): 3%
Fig 5. Cloud vendors graph.

The next chart shows cloud adoption among the 30 STI companies. More than half use some form of cloud email service rather than hosting email in-house. For in-house systems, we don't know whether they have been outsourced to a third party vendor for management or are fully managed by the company's own IT staff.

53.3% using cloud
46.7% not using cloud
Fig 6. Email cloud service adoption.

Conclusion and Afterthoughts

There are many reasons that companies adopt cloud services: cost savings, flexibility, ease of management, the ability to scale quickly, and better security and features compared to legacy on-premise applications. For small companies that don't have in-house IT staff, outsourcing to a cloud vendor like Google or Microsoft allows them to focus on their business rather than worry about IT.

For some other organizations though, regulatory and compliance requirements, or the need or desire to better control the company's data, can prevent public cloud adoption. It is also possible that a particular service is strategic to a company and hence will not be migrated to the cloud. Imagine if, in the early days of Amazon, Jeff Bezos had outsourced Amazon's IT and focused only on selling online; Amazon would not be one of the leading technology and cloud vendors today.

Regardless, in this competitive business world, everyone will eventually have to take a hard look at cloud services, whether public, private or hybrid, in order to keep up with competitors. Already slightly more than half of the STI companies are using some form of cloud email service.

A copy of the scripts used in this survey has been put up at this repository:
https://github.com/ngchianglin/Cloud_Email_Survey

If you have any feedback, comments or suggestions to improve this article, you can reach me via the contact/feedback link at the bottom of the page.

By the way, the author runs his own postfix/dovecot mail server out of interest and passion, for better control of data and for the technical know-how. It is strategic.