Milestone Paper: Research

 

 

1. Introduction

 

250,000 passwords are hacked every week, according to a study by Google, the University of California Berkeley, and the International Computer Science Institute. According to the password manager Keeper Security, the biggest threat to password security is the use of weak passwords, with the top three most common passwords of 2017 being “123456”, “Password”, and “12345678”.

As the information stored online becomes more critical, the number of attacks aimed at acquiring it also increases. Passwords bear the majority of the burden of safeguarding our digital identities and information. Companies thus encourage, and more recently require, users to increase the strength of their passwords by combining uppercase and lowercase letters, numbers, and special characters. However, this is where the problem lies: passwords that are difficult to hack are also difficult to remember. With the increasing number of activities and accounts we have online, the sheer number of passwords poses a challenge for users to remember. The current practice is a compromise on the strength of passwords in exchange for ease of remembrance. With almost all sectors of our lives being digital, password security is a relevant and growing concern.

 

2. Domain Map

My domains of interest are as follows:

Domain Map updated.jpg

Fig 1: Domains of Interest

My core domain of interest is system design. All the other domains that I am interested in fall under the domain of system design, at least partially. I look at problems from a systematic standpoint since I feel that it is one of the best ways to target the root cause of a problem and solve it effectively. System design has multiple aspects to it. I am particularly interested in the role of cybernetics in the domain: how feedback from one element in the system affects and controls the others, and how the communication between them (both humans and machines) takes place.

My other domains of interest include social psychology, human-computer communication, A.I. and machine learning, ubiquitous computing, and more recently, cybersecurity. All of these domains, when observed, have a recurring underlying pattern which forms the reason for my interest in them. They are all an interconnected web of various elements interacting and affecting each other on a complex and non-linear level. This element of everything being connected to one another, and not existing as a single isolated object, is what interests me the most. It wasn’t until I tried to think about what all these seemingly different domains have in common that I realized that system design was what I was interested in, or that it was even a term to begin with.

Cybersecurity is one of the domains I accidentally crossed paths with, and I did not even realize I was interested in it until someone pointed out the similarities in my projects. I did two projects on cybersecurity last semester, and that is when I became interested in this domain. Lastly, I encapsulated all of these domains in an overarching domain of entrepreneurship, because I am interested in working on something practical that solves a problem and creates an impact in the world: something that is needed and, ideally, can be launched by the end of my thesis.

 

3. Research

3.1. Hacking methodologies and prevention methods

Various methodologies are used by hackers to gain access to passwords. This section presents the most common methods of hacking and the current technologies used to prevent it.

3.1.1. Dictionary and brute force hacking

Hackers implement a variety of ways to try to crack passwords and gain access to user accounts.

The most common forms of password hacking include dictionary hacking and brute force hacking. Dictionary hacking involves using a “dictionary” that contains a list of words from a literal dictionary, as well as known passwords and word substitutions, to gain access to an account (sources: Microsoft, Keeper Security, Huffington Post). Brute force hacking involves “a trial and error method used by application programs to decode encrypted data such as passwords or Data Encryption Standard (DES) keys, through exhaustive effort (using brute force) rather than employing intellectual strategies”. In this technique, all possible combinations of passwords are tried. Machines costing less than $1000 are capable of testing billions of passwords per second.

Dictionary hacking and brute force hacking are especially easy when users have weak passwords that use words from different languages, personal information, etc. Even replacing “a” with “@” in a dictionary word like apple will not deter the hacking software, since it already takes into account the common substitutions made by people.
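To make the point about substitutions concrete, here is a minimal sketch of how a dictionary attack can enumerate common variants of each word. The substitution table, suffix list, and function names are hypothetical and chosen for illustration; real cracking tools use far larger word lists and rule sets.

```python
from itertools import product

# Hypothetical substitution table (an assumption for this sketch).
SUBSTITUTIONS = {"a": "a@4", "e": "e3", "i": "i1!", "l": "l1!", "o": "o0", "s": "s$5"}
# Hypothetical list of endings that people commonly append.
SUFFIXES = ["", "!", "1", "123"]


def candidates(word):
    """Yield variants of `word` using the substitutions and suffixes above."""
    # For each letter, the set of characters that may stand in for it.
    options = [SUBSTITUTIONS.get(ch, ch) for ch in word.lower()]
    for combo in product(*options):
        base = "".join(combo)
        for suffix in SUFFIXES:
            yield base + suffix
            yield base.capitalize() + suffix


def dictionary_attack(target, wordlist):
    """Return (matching word, guesses made), or (None, guesses made) if no match."""
    guesses = 0
    for word in wordlist:
        for guess in candidates(word):
            guesses += 1
            if guess == target:
                return word, guesses
    return None, guesses


if __name__ == "__main__":
    print(dictionary_attack("@pp1e!", ["banana", "apple"]))
    print(dictionary_attack("gk3+8(f$", ["banana", "apple"]))
```

A password such as “@pp1e!” is found after only a few hundred guesses from a two-word list, while a random string never appears among the candidates at all.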

As one of my research artifacts, I carried out a survey of eleven individuals who were asked to indicate whether they did any of the following things:

  1. Use the same password in multiple places?
  2. Add “!” onto the end of a password (or other similar symbols/numbers)?
  3. Implement “@pp1e instead of apple” to make your passwords more secure? (replace “a” with “@”, “1” or “l” with “!”, “O” with “0”, etc.)
  4. Use personal information in passwords?

As seen in Figure 2 below, the survey revealed a mode of three, i.e. the surveyed individuals employed three out of the above four features in their passwords.

Final Research Presentation.005.jpeg

Fig 2: Number of people fulfilling password habit criteria

The best way to create strong passwords is to use a random combination of uppercase and lowercase characters, numbers and special characters. A random combination, in this context, is defined as a combination that does not contain any existing word in a language or follow any pattern by which it is produced. A thirteen-character password built from such a random combination was considered impractical to break in 2016. Currently, a 20-character password of this kind is considered impossible to hack, according to Anshul Srivastava, one of the experts interviewed in the domain. However, it is very difficult, if not impractical, for people to remember such long passwords for different accounts everywhere, especially if they are supposed to be random combinations that do not make sense. Existing technology such as password managers is currently on the market to help with this issue.
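The effect of length on a random password can be estimated with a short calculation. The sketch below assumes a 94-symbol alphabet (the printable ASCII letters, digits and punctuation) and an attacker testing one billion guesses per second; both figures are assumptions for illustration, not measurements from this research.

```python
import secrets
import string

# Assumed alphabet: 52 letters + 10 digits + 32 punctuation marks = 94 symbols.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length):
    """Generate a random password using a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def search_space(length, alphabet_size=len(ALPHABET)):
    """Number of distinct passwords of the given length."""
    return alphabet_size ** length


def years_to_exhaust(length, guesses_per_second=1e9):
    """Years needed to try every password at the assumed guessing rate."""
    seconds = search_space(length) / guesses_per_second
    return seconds / (365 * 24 * 3600)


if __name__ == "__main__":
    for n in (8, 13, 20):
        print(n, random_password(n), f"{years_to_exhaust(n):.3g} years")
```

Each additional character multiplies the search space by 94, which is why a modest increase in length has such a large effect on the time a brute force attack would need.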

Password managers “generate, retrieve and keep track of” passwords for users. However, the usage of password managers is low, with a survey by the password manager Roboform revealing that only eight percent of surveyed individuals used password managers. This was reflected in my survey, which revealed that eighty percent of the surveyed individuals did not use password managers, in some cases because they were not aware of them, but mostly because they did not trust password managers with their passwords. To the question, “Do you know what password managers are? Do you use them? Why / why not?”, two responses were:

“I know, but I don’t use them. Because they are not helpful for my life. What if my password manager is hacked?” – response of one of the surveyed individuals.

“I don’t trust it. Makes things more complex.” – response of one of the surveyed individuals.

Password managers store passwords on their servers, and if they are hacked, users stand to lose all their passwords, most of which they don’t even remember. This was evident from the data breach of the popular password manager LastPass in 2015, followed by a vulnerability discovered by Google security researcher Tavis Ormandy in 2017. Thus, users prefer to have weaker passwords that are known only to them over stronger passwords that are saved in the cloud. This lack of trust in some entity remembering users’ passwords is a major drawback of password managers and something that needs to be kept in mind.


Thus, this thesis proposes a solution through a mapping technique that lets users keep passwords that are easy for them to remember, does not store those passwords anywhere (neither on their device nor in the cloud), and still maps them into passwords that are difficult to hack. By inverting the existing relationship, in which passwords that are easy to remember are also easy to hack, into one in which passwords are easy to remember but difficult to hack, the thesis addresses a growing need in the market to tackle the issue of simple passwords. No matter how advanced the latest encryption techniques are, if the base password itself is not strong, other technologies can only do so much to safeguard it.

3.1.2. Keylogging

Keyloggers record the keys struck on a keyboard. In terms of hacking, they record not only keystrokes, but also every click, touch, download, and conversation. They can thus collect information like bank account numbers, user IDs and passwords, personal information, PIN codes, and other data, which is then sent to the hacker’s server. Keyloggers also have the ability to take screenshots of the user’s device screen or access items copied to the clipboard. Cybersecurity company McAfee mentions that keyloggers can also be executed as part of a rootkit, which can evade manual detection or antivirus scans.


In a tip to protect against keylogging, McAfee mentions, “Try an alternative keyboard layout – Most of the keylogger software available is based on the traditional QWERTY layout so if you use a keyboard layout such as DVORAK, the captured keystrokes does not make sense unless converted.” This indicates that at least a certain type of keyloggers can’t function unless:
a) They know that the layout of the keyboard is different and
b) They know what the correct layout is
By randomizing the layout of the keyboard for each individual user, and mapping each key on a keyboard to two distinct keys, we get the following calculation of the number of possible keyboard combinations:
1) There are 55 keys on a standard QWERTY keypad on mobile devices (including the
    alphanumeric and special characters allowed in passwords)
2) Mapping them to one keyboard would result in 55! combinations
3) Mapping them to two keyboards (mapping twice) would result in (55²)! combinations
Thus the total number of unique mapping possibilities for each keyboard would be:

(55²)! = 3025!, a number with more than 9,000 decimal digits

Thus, for a hacker to find the correct mapping of a keyboard by trial and error, they would have to go through a finite but astronomically large number of combinations, and this would help prevent keylogging along with brute force hacking.
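The size of these counts can be checked without computing the factorials in full. The sketch below takes the figures 55! and 3025! from the calculation above at face value and uses the log-gamma function to count their decimal digits; the function name is hypothetical. Both numbers are finite, but far too large to search exhaustively.

```python
import math


def factorial_digits(n):
    """Number of decimal digits in n!, using lgamma(n + 1) = ln(n!)."""
    return math.floor(math.lgamma(n + 1) / math.log(10)) + 1


KEYS = 55  # key count taken from the text above

if __name__ == "__main__":
    print("digits in 55!  :", factorial_digits(KEYS))
    print("digits in 3025!:", factorial_digits(KEYS ** 2))
```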

Mr. Kiran Vangaveti, CEO of BlueSapphire Technologies, a cybersecurity company, mentioned in his expert interview that changing the layout would only work against one kind of keylogging, since many keyloggers read the ASCII values that are generated by the keyboard driver. There would thus be a need for encryption before those values are read within the driver.
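The idea of a per-user randomized layout, and the limitation raised in the interview, can be sketched as follows. This is an illustrative model only: it uses a reduced key set of lowercase letters and digits, and the function names are hypothetical rather than part of the project’s implementation.

```python
import secrets
import string

# Simplified key set for illustration (an assumption; the text above counts 55 keys).
KEYS = string.ascii_lowercase + string.digits


def random_layout(keys=KEYS):
    """Map each physical key to the symbol it produces for one user."""
    shuffled = list(keys)
    secrets.SystemRandom().shuffle(shuffled)  # OS-backed randomness
    return dict(zip(keys, shuffled))


def type_on_layout(physical_keys, layout):
    """Symbols produced when the given physical keys are pressed."""
    return "".join(layout[k] for k in physical_keys)


def invert(layout):
    """Reverse mapping, from produced symbol back to physical key."""
    return {symbol: key for key, symbol in layout.items()}


if __name__ == "__main__":
    layout = random_layout()
    pressed = "apple1"                      # what a position-based logger records
    produced = type_on_layout(pressed, layout)
    print(pressed, "->", produced)          # what a logger reading character values records
```

A logger that records key positions sees only `pressed` and cannot recover `produced` without the layout, whereas a logger that reads the character values coming out of the driver sees `produced` directly, which is the case Mr. Vangaveti described.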

3.1.3. Shoulder surfing and surveillance

Shoulder surfing is the act of someone looking over a person’s shoulder or spying on their screen or keyboard as they enter a password so as to steal it. This can be especially dangerous in public or other crowded areas where a lot of people stand in close proximity to one another. A similar technique is surveillance, which uses CCTV or hidden cameras to see what a person is typing.

 

3.2. Biometrics, 2 Factor Authentication, and other techniques

This thesis is targeting the root problem of people using weak passwords to protect their data and not necessarily competing with biometrics or 2 factor authentication which are either alternatives to passwords or technologies that work in conjunction with passwords to make data safer. However, to get a fuller picture, let’s briefly dive into the problems with both of these techniques individually, especially since they aren’t apparent at first.

Biometrics are permanent, meaning that, unlike a password, a biometric cannot be changed once it is compromised. Thus, if a hacker gets a copy of a user’s fingerprint, every place the fingerprint has been used is permanently compromised. Given that fingerprints can be replicated for as little as $5, or digitally recreated from just a photo showing a person’s hand, biometrics are not as secure as they seem. Additionally, most websites and authentication portals still require the use of passwords, and so there is still a need to have stronger passwords since biometrics don’t always replace password usage.

2 Factor authentication (2FA) requires two out of three types of authentication credentials to gain access to an account: something someone knows, something someone has, and something someone is. Frequently, it combines a password (something someone knows) with an additional factor for authentication; however, the need for stronger passwords still exists. It is important to know that one of the weakest points of 2FA is the wireless carrier network over which text messages are sent to the user as a 2FA credential.

 

4. Research methodologies and artifacts

Research methodologies used in this phase include:

  1. User Survey
  2. Expert Interviews
  3. Secondary Research
  4. Market Research
  5. Technical Evaluation
  6. Literature Review
  7. Audience Mapping

The results from the above research methodologies have been included throughout this paper as facts, opinions, and decisions.

In addition, listed below are the results from the user survey, target audience, and expert interviews since they need to be mentioned in more detail.

4.1. User Survey:

Problem Identified: Not just lack of awareness, but lack of practical solutions

Methodology link

Users can be of two types: those who are not aware of the problem of weak passwords, and those who are aware of it but don’t have a better alternative. Along with the statistics shown in Fig 2, the survey I conducted with eleven individuals revealed the following statistics:

  1. 70% of respondents rated 5 (most important) for how important it is for them to keep their passwords safe. 10% rated 4, 10% rated 3 (Fig 3).
    Thesis Passkey Pitch.022.jpeg
    Fig 3: Importance of password security to respondents

  2. Most of the respondents agreed that “Alex@1209” and “Cupc@ke!” were more likely to be used as a real-life password, but “5Gkfd9&#” and “gk3+8(f$” were more secure / more difficult to hack.
  3. 60% of respondents said they weren’t satisfied by how strong their passwords were. 20% said they were fine with some of them / could be stronger. 20% were satisfied (Fig 4).
    Thesis Passkey Pitch.023.jpeg

    Fig 4: Respondent satisfaction with the strength of their passwords

The survey reveals that users understand the need for strong passwords and want stronger ones, but end up without them for one or both of two reasons: they lack a practical way to remember difficult passwords, or they do not know that the techniques they currently use (for example, character substitutions such as @ for a, or adding an ! after their passwords) are not unique and are already accounted for by hackers. This thesis aims to help users fulfill this need.

The other type of users are those who are not aware of the need to have strong passwords. In that case, this thesis will still help them automatically have strong passwords without even having to think about it.

4.2. Target Audience

The target audience for the thesis can be divided into three categories viz., core audience/users, secondary users and tertiary users.

Core group: – First adopters

  1. Age: 16-65 years.
  2. People who care / are concerned about their password security.
  3. Tech savvy people who have important data online and want to protect it
  4. People who know about hacking and security threats.
  5. People who have easier passwords (are not satisfied by the strength of their current passwords/wish they were stronger) but don’t have a practically feasible way of doing that; thus compromising password security for the practicality of remembrance.
  6. People who considered password managers but did not use them because of security concerns.

Secondary Group: – Education needed

  1. Age: 16-110 years.
  2. People who are indifferent about password security.
  3. People who are currently using weak passwords and don’t know it
  4. People who don’t know about hacking and security threats

Tertiary Group: – will only use if inbuilt in system (e.g. Android, Windows, etc.)

  1. Age: 10/12 years – 120 years.
  2. People who already use very strong passwords (power users – may not trust anything or anyone else).
  3. Any person using passwords.
  4. People who use password managers (may convert if they see benefit – already see benefit of password managers so may not convert).
  5. People who don’t care about their password security.

 

4.3. Expert Interviews

Methodology link
Complete notes

Cybersecurity being such a niche and specialized field, expert interviews were a crucial part of my research. Since I had a semi-clear idea of what I was doing and how the mapping technique was going to work, I wanted to conduct qualitative interviews in the form of a discussion. The goals of the interviews were to clarify some of my doubts about the exact working and point of intervention of keyloggers (details which weren’t available online) and to discuss my idea to see if it made sense and was not something that could easily be hacked into.

The questions were more conversational and informal, and I ended up with not only direct answers to my questions, but also a lot of advice, guidance, suggestions, tips and resources to better my idea. These were roughly the questions I asked:

  1. What is the exact point of intervention of a keylogger? Does it only record the screen? Does it take into consideration the position of the key that is pressed or its ASCII code (value)? Does the keylogger retrieve the value when the data is being sent from the keyboard to the computer system, or from there to the screen? Basically where in the process of the user hitting a key to it being displayed on the screen is the keylogger functioning?
  2. What are the current ways in which keylogger hacking is being prevented? What are the ways in which it can be prevented? What techniques/algorithms can be used?
  3. [Explain my concept and functioning of the password mapping to them]. Does the project idea seem viable? Is it something that can easily be hacked by a hacker in the field? Does the idea make sense from a technical perspective? Are there any weaknesses that would need to be addressed?
  4. Which programming language would be the best to use for coding out my project? What things should I keep in mind?
  5. Any possible means of hacking you can foresee in my concept? Any loopholes that I need to address? Is the concept viable? Does it have potential? What are its pitfalls?

I did not get a chance to ask the third interviewee questions regarding the feasibility of my idea and could only ask him about the mechanics of keylogging and how to work around it. The general feedback on my idea from the first two interviewees was as follows:

  1. Both agreed that there was a need to help encourage users to use stronger passwords.
  2. Both interviewees were excited about the idea and said that there was great potential to explore it further.
  3. Both interviewees said that the idea needed to be built upon to make it more resistant to hacking. They gave suggestions such as the usage of hash functions, automatic update and randomization after a fixed period of time, protection of the app from hackers attacking the actual software, etc.
  4. Both interviewees explained the working of keyloggers, and the slight variation in their explanations helped me get a more comprehensive picture.
  5. Both acknowledged that if the project was successful, it would help solve a gap in the market and achieve its goal of helping users have strong passwords.
  6. Both mentioned that the idea was possible to execute, though it needed more research and understanding of different algorithms (such as hash functions). Eventually, if the idea was to be launched, then professional programmers would be needed.
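As an illustration of the hash function suggestion in the feedback above, the sketch below derives a long password from a memorable one without storing anything. It is not the mapping technique this thesis proposes; it is a generic derivation using PBKDF2 from Python’s standard library, and the function name, the use of the site name as salt, and the iteration count are all assumptions made for the example.

```python
import base64
import hashlib


def derive_password(memorable, site, length=20, iterations=200_000):
    """Derive a site-specific password from a memorable one (illustrative only).

    Nothing is stored: the same inputs always reproduce the same output.
    Using the site name as the salt is a simplification for this sketch.
    """
    raw = hashlib.pbkdf2_hmac(
        "sha256",
        memorable.encode("utf-8"),
        site.encode("utf-8"),
        iterations,
        dklen=32,
    )
    # Base85 turns the 32 derived bytes into 40 printable characters;
    # `length` must therefore be at most 40.
    return base64.b85encode(raw).decode("ascii")[:length]


if __name__ == "__main__":
    print(derive_password("Cupc@ke!", "example.com"))
    print(derive_password("Cupc@ke!", "example.org"))
```

A scheme like this still depends on the strength of the memorable input if an attacker knows how the derivation works, which is consistent with the interviewees’ advice that the idea needs additional layers of protection.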

The following sections contain details about the interviewee profiles. The complete notes on the interview can be found here:

4.3.1. Interviewee 1

Name: Akash Pardhi
Profile: MTech in Computer Science with specialization in Cybersecurity (Data security)
MTech thesis in Cryptography
12+ years professional experience in IT industry
Interview type: Phone call (call duration: approx. 51 mins)

4.3.2. Interviewee 2

Name: Anshul Srivastava
Profile: Engineer with 10+ years of work experience at firms such as IBM
Interview type: Phone call (call duration: 1:21:52 – 1 hour 21 mins)

4.3.3. Interviewee 3

Name: Kiran Vangaveti
Profile: CEO of BlueSapphire Technologies, an IT Security Services Company. Previously worked at companies such as GE as the director of Data Protection. Education in Computer Science.
Interview type: Phone call (call duration: approx 20 mins)

4.4. Technical Evaluation

Technical evaluation was carried out on the terms mentioned by the expert interviewees or by me that are currently in use, or could potentially be used, in my project implementation. These terms included public key and private key, hashing functions, encryption and decryption, MDS for randomization, SQL injection, RSHA, array for sending passwords across a network, machine learning as a black box for randomization, etc. The goal of the technical evaluation is to understand how these different technologies work and how they can be combined in different ways to create a project more resistant to hacking. This information will be useful during the prototyping phase to decide which technologies to use.

 

5. Conclusion

The research has helped in the following ways, both as a foundation and as a guide for the prototyping phase:

  1. The project had started out as a personal need but has grown into so much more through research. Secondary research, user surveys and interviews, market research, all have helped identify and confirm the presence of a gap in the market—a space for innovation.
  2. It has helped validate many hypotheses that I had going into the project’s research phase.
  3. Expert interviews confirmed the need of innovation in this area from a technical standpoint, helped identify shortcomings and pitfalls that need to be addressed, and gave suggestions on how to overcome them. Most of all, they helped confirm that the project had a sound grounding and would be able to work in a field as complex as cybersecurity.
  4. Audience mapping helped get a clearer understanding of the different audiences and how the project would need to be developed after the MVP, should we want to encompass the secondary and tertiary audiences as well.
  5. Technical evaluation and literature review helped in learning more about the technologies suggested by expert interviewees. They also helped me get a better understanding of how to combine different encryption technologies to create different sets that can be prototyped and tested in the next phase of the thesis.

The design attributes that would inform my project moving forward are as follows:

 

  • Trust: Passwords being such a critical piece of information, trust by the user in what they are using is essential. As seen from the research on password managers, if users don’t feel like they can trust a piece of technology, they would rather not use it despite the benefits it may provide them with.
  • Transparency: Transparency means transparency in how the technology works. By being more transparent, users will be able to trust the product more.
  • Control: Users should always feel like they are in control of the technology they interact with. In the context of passwords, should they wish, they should be able to customize the amount of control the application has to a level that they feel comfortable with. If users feel a lack of control at any point in the experience, they will stop using the product.
  • Awareness: Awareness regarding the importance of having stronger passwords and how current techniques users employ are predictable and not strong enough. Awareness, especially to the secondary and tertiary target audiences to make them realize why this product is important, or on a more fundamental level, why password security is something to think seriously about.

 

 

The next phase is to turn the information collected during research into a plan for building and testing prototypes, incorporating any further research as I iterate on the prototypes and the project. Research in this field is very important and will continue throughout my thesis, whether through more expert interviews, secondary research, literature review, user testing, or any other methodology relevant to the needs of the project at that point in time.

 
