Selenium Basics

In this post we will cover all the basics to get you started writing Selenium web automation tests. If you don’t already know what Selenium is or how to set up your Selenium environment, please head over to the setup guide.

Now that you are all set, let's dive right into the code. Our examples use C#, but the code should look much the same in whichever programming language you are using.

Webdrivers

Selenium calls its central object a webdriver. A webdriver is used to interact with a specific browser instance. There are two kinds of webdriver available: a local one and a remote one.

Local Webdriver

The local webdriver is used when the browser instance should run on the same machine as the test code, for example when you run your tests alongside the browser on your local machine.

If your environment is set up correctly, you can start and stop browser instances with a single line of code each:

// Starts a new instance of Google Chrome
IWebDriver driver = new ChromeDriver();
 
// Closes the running Google Chrome instance
driver.Quit();

Remote Webdriver

To work with the remote webdriver you will need a Selenium Grid running on a machine in your local network, or the testing rig of a third-party provider. In this case you need to supply the location of your testing rig and a special settings object detailing which browser you would like to run.

Uri gridUrl = new Uri("http://TestGrid:4444/wd/hub");
// Browser options act as the settings object (older Selenium versions used DesiredCapabilities.Firefox() here)
FirefoxOptions options = new FirefoxOptions();
 
// Launches a new Firefox instance on the TestGrid server
IWebDriver driver = new RemoteWebDriver(gridUrl, options);
 
// Closes the remote browser
driver.Quit();

This comes in handy for running Selenium tests on a dedicated testing machine or as part of your build process.

It is really important to always call driver.Quit() at the end of your test cases (or in the appropriate TearDown method). Otherwise the browser stays open after the test has finished.
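One way to make sure the browser is always closed, even when a test step throws, is a try/finally block. The following is only a minimal sketch: it assumes plain C# top-level statements without a test framework, the Selenium.WebDriver NuGet package and a working ChromeDriver setup. In a test framework the driver.Quit() call would move into the teardown method instead.

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

// Starts a new instance of Google Chrome
IWebDriver driver = new ChromeDriver();

try
{
    // Test steps go here; any of them may throw an exception
    driver.Navigate().GoToUrl("https://www.browseemall.com");
}
finally
{
    // Runs whether the test steps succeeded or threw,
    // so the browser does not stay open
    driver.Quit();
}
```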

Navigation

Now that we are able to open and close a browser instance, we can navigate the browser to a page we would like to test. For this the driver object contains a Navigate method. Using this method you can open a new URL, navigate back or forward, and refresh the current page. This example opens Chrome and navigates to our homepage:

// Starts a new instance of Google Chrome
IWebDriver driver = new ChromeDriver();
 
// Open a webpage (method will return after the page is fully loaded)
driver.Navigate().GoToUrl("https://www.browseemall.com");
 
// Closes the running Google Chrome instance
driver.Quit();
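The other navigation actions mentioned above work the same way. This sketch assumes the same local Chrome setup as before and simply visits two pages so that there is a history to move through:

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

IWebDriver driver = new ChromeDriver();

try
{
    // Navigate() returns an INavigation object that can be reused
    INavigation navigation = driver.Navigate();

    // Visit two pages so the browser has a history
    navigation.GoToUrl("https://www.browseemall.com");
    navigation.GoToUrl("https://www.browseemall.com/Blog");

    // Go back to the homepage, then forward to the blog again
    navigation.Back();
    navigation.Forward();

    // Reload the current page
    navigation.Refresh();
}
finally
{
    driver.Quit();
}
```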

Finding Elements

Great, now we might want to interact with the page during a test, for example by clicking buttons or entering text into a form field. But before we can do so we need to help Selenium identify the page element we want to interact with.

You can search for an element by LinkText, ClassName, Id, CssSelector, Name, TagName or XPath. Let's try to find the Try For Free button on our homepage based on the link text:

// Starts a new instance of Google Chrome
IWebDriver driver = new ChromeDriver();
 
// Open a webpage (method will return after the page is fully loaded)
driver.Navigate().GoToUrl("https://www.browseemall.com");
 
// Find an element based on the link text
IWebElement element = driver.FindElement(By.LinkText("Try For Free"));            
 
// Closes the running Google Chrome instance
driver.Quit();

Select your search criteria wisely! The best search criterion is one that is unlikely to change during development of the page.
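To compare a few of these search criteria side by side, the sketch below builds three locators for the same element and counts how many elements each one matches. The ID search-field is the blog search box used later in this post. The CSS selector and XPath expression are assumptions about the page markup (they presume the field is an input element) and are only meant as illustration.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

IWebDriver driver = new ChromeDriver();

try
{
    driver.Navigate().GoToUrl("https://www.browseemall.com/Blog");

    By[] locators =
    {
        By.Id("search-field"),                    // ID used elsewhere in this post
        By.CssSelector("input#search-field"),     // hypothetical: assumes an <input> element
        By.XPath("//input[@id='search-field']")   // hypothetical: same assumption as above
    };

    foreach (By locator in locators)
    {
        // FindElements returns an empty collection when nothing matches,
        // whereas FindElement would throw a NoSuchElementException
        int matches = driver.FindElements(locator).Count;
        Console.WriteLine($"{locator}: {matches} match(es)");
    }
}
finally
{
    driver.Quit();
}
```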

Clicking

Now that we have managed to identify an element, we can go ahead and instruct Selenium to click it. Luckily this is really easy:

// Starts a new instance of Google Chrome
IWebDriver driver = new ChromeDriver();
 
// Open a webpage (method will return after the page is fully loaded)
driver.Navigate().GoToUrl("https://www.browseemall.com");
 
// Find an element based on the link text
IWebElement element = driver.FindElement(By.LinkText("Try For Free"));  
 
// Now click the button
element.Click();          
 
// Closes the running Google Chrome instance
driver.Quit();

Text Input

In the same manner we can enter text into an element like a textbox. This example enters the word Selenium into this blog's search field on the right:

// Starts a new instance of Google Chrome
IWebDriver driver = new ChromeDriver();
 
// Open a webpage (method will return after the page is fully loaded)
driver.Navigate().GoToUrl("https://www.browseemall.com/Blog");
 
// Find an element based on its ID
IWebElement element = driver.FindElement(By.Id("search-field"));  
 
// Now type
element.SendKeys("Selenium");          
 
// Closes the running Google Chrome instance
driver.Quit();

Did you notice that this time we used an ID to identify the element? In a real-world scenario the ID is more likely to stay the same than the link text in our clicking example.

Windows / Popups

Sooner or later you will encounter a situation in which the website you test opens a new window, tab or popup that you need to interact with. Luckily you can use the webdriver's SwitchTo() method to direct your interactions to any available tab or window. For simplicity Selenium treats everything as a window, even if it is really a new tab.

In this example we will navigate to our homepage, open the webinar registration in a new tab and close this tab again.

// Starts a new instance of Google Chrome
IWebDriver driver = new ChromeDriver();
 
// Open a webpage (method will return after the page is fully loaded)
driver.Navigate().GoToUrl("https://www.browseemall.com");
 
// Find an element based on the link text
IWebElement element = driver.FindElement(By.LinkText("Save Your Seat"));  
 
// Now click the button
element.Click();        
 
// Save the current and new window handles for easy access
string currentWindow = driver.CurrentWindowHandle;
string newWindow = driver.WindowHandles[1];
 
// Switch the driver over to the new window
driver.SwitchTo().Window(newWindow);
 
// Close the new window
driver.Close();
 
// Switch the driver back to the first window
driver.SwitchTo().Window(currentWindow);  
 
// Closes the running Google Chrome instance
driver.Quit();
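The example above assumes the new tab is already open and is the second entry in WindowHandles. A slightly more defensive variant, sketched below, waits for the second window to appear and then picks whichever handle is not the original one. It assumes the Selenium.Support NuGet package for WebDriverWait, and the 10-second timeout is an arbitrary choice.

```csharp
using System;
using System.Linq;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

IWebDriver driver = new ChromeDriver();

try
{
    driver.Navigate().GoToUrl("https://www.browseemall.com");

    // Remember the original window before anything new is opened
    string originalWindow = driver.CurrentWindowHandle;

    // Click the link that opens the new tab
    driver.FindElement(By.LinkText("Save Your Seat")).Click();

    // Wait (up to an assumed 10 seconds) until a second window exists
    var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
    wait.Until(d => d.WindowHandles.Count > 1);

    // Pick the handle that is not the original window
    string newWindow = driver.WindowHandles.First(handle => handle != originalWindow);

    // Switch to the new window, close it, and switch back
    driver.SwitchTo().Window(newWindow);
    driver.Close();
    driver.SwitchTo().Window(originalWindow);
}
finally
{
    driver.Quit();
}
```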

Screenshots

If any of your test cases fail, it can be a great help to have screenshots of the website for later troubleshooting. With Selenium you can instruct the browser to take a screenshot at any time. The example below creates a screenshot of our homepage and saves it to disk for later inspection.

// Starts a new instance of Google Chrome
IWebDriver driver = new ChromeDriver();
 
// Open a webpage (method will return after the page is fully loaded)
driver.Navigate().GoToUrl("https://www.browseemall.com");
 
// Take a screenshot
ITakesScreenshot takes = (ITakesScreenshot) driver;
Screenshot screenshot = takes.GetScreenshot();
 
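// Save the screenshot for later use ("homepage.png" is just an example path).
// Older Selenium versions also require an image format as a second argument.
string filepath = "homepage.png";
screenshot.SaveAsFile(filepath);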
 
// Closes the running Google Chrome instance
driver.Quit();

Keep in mind that your screenshots only contain the visible portion of the website. To capture a specific element you might want to use JavaScript to scroll to it first.
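As a rough sketch of that idea, the following scrolls the Try For Free button from the earlier examples into view before taking the screenshot. The file name is made up for illustration, and the single-argument SaveAsFile call assumes a reasonably recent Selenium version.

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

IWebDriver driver = new ChromeDriver();

try
{
    driver.Navigate().GoToUrl("https://www.browseemall.com");

    // Find the element we want to have in the picture
    IWebElement element = driver.FindElement(By.LinkText("Try For Free"));

    // Scroll the element into the visible area; Selenium passes it
    // to the script as arguments[0]
    IJavaScriptExecutor executor = (IJavaScriptExecutor)driver;
    executor.ExecuteScript("arguments[0].scrollIntoView(true);", element);

    // Take and save the screenshot (example file name)
    Screenshot screenshot = ((ITakesScreenshot)driver).GetScreenshot();
    screenshot.SaveAsFile("try-for-free.png");
}
finally
{
    driver.Quit();
}
```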

Executing JavaScript

Selenium can also be used to execute arbitrary JavaScript inside the browser you test. In this example we will scroll the page back to the top.

// Starts a new instance of Google Chrome
IWebDriver driver = new ChromeDriver();
 
// Open a webpage (method will return after the page is fully loaded)
driver.Navigate().GoToUrl("https://www.browseemall.com");
 
// Execute JavaScript
IJavaScriptExecutor executor = (IJavaScriptExecutor) driver;
executor.ExecuteScript("window.scrollTo(0, 0);");
 
// Closes the running Google Chrome instance
driver.Quit();

The ExecuteScript method returns whatever value your JavaScript returns, such as a string, number or boolean. The value arrives as a plain object, so you usually need to cast it to the type you expect.

There you have it, all the basics necessary to start writing Selenium tests. Next week we will dive into more advanced stuff, so stay tuned.
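To illustrate reading such return values, here is a small sketch that fetches the page title and the number of links on the page. The script has to use an explicit return statement, and in the .NET bindings whole numbers come back as long, which is why the second result is cast that way.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

IWebDriver driver = new ChromeDriver();

try
{
    driver.Navigate().GoToUrl("https://www.browseemall.com");

    IJavaScriptExecutor executor = (IJavaScriptExecutor)driver;

    // A JavaScript string comes back as a .NET string
    string title = (string)executor.ExecuteScript("return document.title;");

    // A JavaScript integer comes back as a .NET long
    long linkCount = (long)executor.ExecuteScript("return document.links.length;");

    Console.WriteLine($"Title: {title}, links on page: {linkCount}");
}
finally
{
    driver.Quit();
}
```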