Phishing & Anti-Phishing Techniques


M.Sc. Engg. CSE

IEB M/33372



The Internet has changed human life significantly and now dominates many fields, including e-commerce and e-healthcare. It increases the comfort of human life, but it also increases the need for security measures. For example, web browsers and servers take considerable care to guarantee safe business over the Internet, yet they remain vulnerable to attacks such as phishing. Phishing is a form of online identity theft that aims to steal sensitive information, such as online banking passwords and credit card information, from users. Phishing scams have been receiving extensive press coverage because such attacks have been escalating in number and sophistication. Phishing is not limited to the most common attack, in which targets are sent spoofed (and often poorly spelt) messages imploring them to divulge private information. Instead, as recently documented in both academic and criminal contexts, phishing is a multi-faceted techno-social problem for which there is no known single silver bullet.


  1. Classification of Phishing Attacks:

A large number of e-mails is sent to random victims; each e-mail urges the recipient to update his or her data via a (spoofed) web site; the victim then enters the data there.

Spoofed e-mails are sent to a set of victims asking them (usually) to upgrade their passwords, data accounts, etc. MSN, ICQ, AOL and other IM channels are also used to reach the victims. Social engineering techniques are used to gain the victim’s sensitive information.

Phishers also call victims on the phone and apply classic social engineering techniques.

Another kind of attack exploits Internet vulnerabilities. This approach is usually used to install dialers automatically.

  • Typical Process of Phishing: In a typical phishing attack [1], phishers send a large number of spoofed e-mails, which appear to come from a legitimate organization, to random Internet users. The e-mail urges the recipient to provide sensitive information. By clicking on the link provided in the e-mail, the user is directed to a bogus site implemented by the attacker.



  • Phishing Attack Stages


Phishing attacks involve several stages:

  • The attacker obtains E-mail addresses for the intended victims. These could be guessed or obtained from a variety of sources.
  • The attacker generates an E-mail that appears legitimate and requests the recipient to perform some action.
  • The attacker sends the E-mail to the intended victims in a way that appears legitimate and obscures the true source.
  • Depending on the content of the E-mail, the recipient opens a malicious attachment, completes a form, or visits a web site.
  • The attacker harvests the victim’s sensitive information and may exploit it in the future.

As shown in Figure 1 below, the phishing attack starts with an E-mail to the intended victims. The attacker creates the E-mail with the initial goal of getting the recipient to believe that the E-mail might be legitimate and should be opened. Attackers obtain E-mail addresses from a variety of sources, including semi-random generation, skimming them from Internet sources, and address lists that the user believed to be private [CNET].

Spam filtering can block many of the phishing E-mails. If the institution whose customers are being phished regularly uses authenticated E-mail (such as PGP or S/MIME), the recipient may notice that the E-mail does not have a valid signature, thereby stopping the attack. Once the E-mail is opened by the user, the E-mail contents have to be sufficiently realistic to cause the recipient to follow the directions in the E-mail.




Anti-phishing defenses can be server-based or client-based solutions.


Server-based anti-phishing techniques: these are implemented by service providers (ISPs, etc.) and are of the following types:

Brand Monitoring: crawling online websites to identify “clones”, which are considered phishing pages. Suspected websites are added to a centralized “black list”.

Behavior Detection: for each customer, a profile is built (after a training period) and then used to detect anomalies in that customer’s behavior.
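The behavior-detection idea above can be sketched as follows. This is only a minimal illustration: the source does not say which features or statistics a real system uses, so the single numeric feature (for example, a transaction amount), the z-score test, the class name `CustomerProfile`, and the threshold of 3.0 are all assumptions.

```python
from statistics import mean, stdev


class CustomerProfile:
    """Hypothetical per-customer profile built from a training period."""

    def __init__(self, training_values):
        if len(training_values) < 2:
            raise ValueError("need at least two training observations")
        # The "profile" here is just the mean and sample standard deviation.
        self.mu = mean(training_values)
        self.sigma = stdev(training_values)

    def is_anomalous(self, value, threshold=3.0):
        """Flag values more than `threshold` standard deviations from the mean."""
        if self.sigma == 0:
            # All training values were identical: any deviation is unusual.
            return value != self.mu
        return abs(value - self.mu) / self.sigma > threshold


# Usage: train on past behavior, then test new observations.
profile = CustomerProfile([40, 55, 60, 45, 50])
usual = profile.is_anomalous(52)
unusual = profile.is_anomalous(5000)
```

A production system would presumably combine many features (login time, location, device) rather than a single value.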

Security Event Monitoring: security events registered by several sources (OS, applications, network devices) are analyzed and correlated to identify anomalous activity, or used for post-mortem analysis following an attack or fraud.


Client-based: these techniques are implemented on the user’s end point through browser plug-ins or e-mail clients and are of the following types:

E-mail based analysis: e-mail based approaches typically use filters and content analysis. If trained regularly, Bayesian filters are quite effective in intercepting both spam and phishing e-mails. The following algorithm describes how a Bayesian filter works:


Bayesian Algorithm:

1) Split the e-mail into tokens (words). This requires the number of spam messages and the number of legitimate messages seen during training, and the frequency of each word in each type of message.


2) Calculate the probabilities for each word.

P (legitimate) = word frequency in legitimate messages / number of legitimate messages.

P (spam) = word frequency in spam messages / number of spam messages.


3) Calculate the likelihood that the word indicates spam (its spamicity) using a special form of Bayes’ Rule, where likelihood = a/(a + b), a is P (spam) for the word and b is P (legitimate) for the word.


4) Choose the n tokens whose spamicity is farthest from 0.5 in either direction. The farther a value is from 0.5 (neutral), the more certain we can be that the token belongs to one category or the other.

Combine the spamicities of these n tokens to get a figure for the whole message using Bayes’ Rule. In basic terms, Bayes’ Rule determines the probability of an event occurring based on the probabilities of two or more independent evidentiary events. For three evidentiary events with probabilities a, b, and c, the combined probability is

$$\frac{abc}{abc + (1-a)(1-b)(1-c)}$$

If the end result is closer to 1.0, the message is classified as spam; if it is closer to 0.0, the message is classified as legitimate.
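The steps of the Bayesian algorithm above can be sketched in Python as follows. This is an illustrative sketch, not a production filter. It makes several assumptions that are not in the text: each word is counted once per message, unseen words get a neutral spamicity of 0.5, spamicities are clamped to [0.01, 0.99] so that no single word forces the result to exactly 0 or 1, and n defaults to 15. The names `BayesFilter` and `tokenize` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Step 1: split a message into lowercase word tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class BayesFilter:
    def __init__(self):
        self.spam_counts = Counter()  # word -> number of spam messages containing it
        self.ham_counts = Counter()   # word -> number of legitimate messages containing it
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        tokens = set(tokenize(text))  # assumption: count each word once per message
        if is_spam:
            self.spam_counts.update(tokens)
            self.n_spam += 1
        else:
            self.ham_counts.update(tokens)
            self.n_ham += 1

    def spamicity(self, token):
        """Steps 2 and 3: per-word probabilities and spamicity."""
        p_spam = self.spam_counts[token] / max(self.n_spam, 1)
        p_ham = self.ham_counts[token] / max(self.n_ham, 1)
        if p_spam + p_ham == 0:
            return 0.5  # assumption: unseen words are neutral
        s = p_spam / (p_spam + p_ham)
        return min(max(s, 0.01), 0.99)  # assumption: clamp away from 0 and 1

    def score(self, text, n=15):
        """Step 4 and the combination step: use the n most decisive tokens."""
        probs = [self.spamicity(t) for t in set(tokenize(text))]
        probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
        top = probs[:n]
        prod = math.prod(top)
        inv = math.prod(1 - p for p in top)
        return prod / (prod + inv)


# Usage with a tiny, made-up training set.
f = BayesFilter()
f.train("verify your account password now", is_spam=True)
f.train("meeting agenda for monday", is_spam=False)
spam_score = f.score("verify your password")    # close to 1.0
ham_score = f.score("monday meeting agenda")    # close to 0.0
```

With more than a handful of tokens, the two products can underflow; real filters often sum logarithms instead, which is omitted here for brevity.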

Black Lists: black lists are collections of URLs identified as malicious. The black list is queried by the browser at run time whenever a page is loaded. If the currently visited URL is included in the black list, the user is warned of the danger; otherwise the page is considered legitimate.
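A black-list lookup of the kind described above might look like the following sketch. It is a simplification under stated assumptions: the list is a local in-memory set rather than a remote service, matching is done on the host name (and its parent domains) rather than on full URLs, and the listed domains are made-up placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical entries; a real list would be downloaded or queried remotely.
BLACKLIST = {"paypa1-login.example", "secure-update.example.net"}


def normalize_host(url):
    """Extract the lowercase host name from a URL ('' if there is none)."""
    host = urlsplit(url).hostname or ""
    return host.rstrip(".")


def is_blacklisted(url, blacklist=BLACKLIST):
    """Return True if the host or any of its parent domains is listed."""
    parts = normalize_host(url).split(".")
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))


# Usage: check a URL before the page is shown to the user.
if is_blacklisted("http://www.paypa1-login.example/verify?id=1"):
    warning = "This page is on the black list."
```

Note the weakness implied by the text: a page that is not yet on the list is treated as legitimate, so freshly created phishing sites pass this check.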

Information flow: information flow solutions are based on the premise that while a user can be easily fooled by URL obfuscation or a fake domain name, a program will not be. AntiPhish is an example of this type of technique: it keeps track of the sensitive information that the user enters into web forms and raises an alert if something is considered unsafe.
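The information-flow idea can be illustrated with a small sketch that remembers on which domain a sensitive value was first entered and flags its reuse elsewhere. This is not AntiPhish's actual implementation; the class name `SensitiveInfoTracker`, the keyed-hash storage, and the rule "first domain seen is the trusted one" are assumptions made for illustration.

```python
import hashlib
import hmac
import os


class SensitiveInfoTracker:
    """Hypothetical tracker mapping sensitive values to their home domain."""

    def __init__(self, key=None):
        # A random key so stored digests are not plain hashes of passwords.
        self._key = key or os.urandom(32)
        self._seen = {}  # digest -> domain where the value was first entered

    def _digest(self, value):
        return hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()

    def check(self, domain, value):
        """Return True if entering `value` on `domain` looks safe, False to alert."""
        domain = domain.lower()
        # The first domain a value is entered on is recorded as its home.
        known = self._seen.setdefault(self._digest(value), domain)
        return known == domain


# Usage: call check() before a form submits a password field.
tracker = SensitiveInfoTracker()
tracker.check("bank.example", "s3cret")             # first use, recorded
alert = not tracker.check("evil.example", "s3cret")  # same value, other domain
```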

Similarity of layouts: more advanced techniques try to distinguish a phishing page from a legitimate page by comparing their visual similarity. DOM-AntiPhish computes the similarity value by extracting the DOM trees of the web pages under consideration.


DOM-AntiPhish description: when a password associated with a certain domain is reused on another domain, the system compares the layout of the current page with that of the page where the sensitive information was originally entered. For the comparison, the DOM trees of the original web page and the new one are checked. If the system determines that the two trees are similar in appearance, a phishing attack is assumed.


DOM-AntiPhish Similarity Computation:

[Figure: phishing example comparing a legitimate web page and a phishing web page (e.g. <html><body> Hello </body></html>) with their respective DOM trees: legitimate DOM tree and phishing DOM tree.]
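To make the layout comparison concrete, the sketch below compares two pages by the order of their HTML tags. It is a rough stand-in, not DOM-AntiPhish's actual similarity metric, which the text does not specify: the flattened tag sequence, the use of `difflib.SequenceMatcher`, and the 0.9 threshold are all assumptions.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects opening tags in document order (a flattened view of the DOM)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html):
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def dom_similarity(html_a, html_b):
    """Return a value in [0, 1]; 1.0 means identical tag sequences."""
    return SequenceMatcher(None, tag_sequence(html_a), tag_sequence(html_b)).ratio()


def looks_like_phishing(original_html, current_html, threshold=0.9):
    """Hypothetical decision rule: similar layout on a different domain is suspicious."""
    return dom_similarity(original_html, current_html) >= threshold


# Usage: the phishing page copies the structure but changes the text.
legit = "<html><body><form><input><input></form></body></html>"
phish = "<html><body><form><input><input></form>Sign in</body></html>"
suspicious = looks_like_phishing(legit, phish)
```

Because only the tag structure is compared, changes to visible text do not affect the score, which matches the intuition that a clone keeps the original layout.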