Pick the Right Locator Strategy during Mobile Automation


   Here we focus only on the selector strategies provided by Appium for native iOS and Android testing using the UiAutomator2 and XCUITest drivers. Here’s a prioritized list of locator strategies:

  1. accessibility id
  2. id
  3. XPath
  4. Class name
  5. Locators interpreted by the underlying automation frameworks, such as: -android uiautomator, -ios predicate string, -ios class chain
  6. -image

1. accessibility id
That this is the top choice should surprise nobody. If you have the option of using accessibility IDs, use them. Normally an app developer needs to add these specifically to UI elements in the code. The major benefit of accessibility IDs over the plain id locator strategy is that while app developers add these IDs for testing, users with disabilities or accessibility needs benefit too. People who use screen readers or other assistive devices, and algorithms which inspect UIs, can better navigate your application. On Android, this locator strategy uses the accessibility contentDescription property. On iOS, it uses the accessibility identifier. Here’s something surprising: in the XCUITest driver, the accessibility id, id, and name locator strategies are all identical.

  They are implemented the same way. Go ahead and try switching your locator strategies; you will get the same results. This may change in the future, but for now you can find an element using its name or the text in it, because iOS has many ways of setting a default accessibility identifier if the developer does not provide one.

2. id
Element IDs need to be added by a developer, but they allow us to pinpoint the exact element in the app we care about, even if the UI changes appearance. The drawback is you need to be able to talk to your developers. Many testing teams do not have this luxury.

   This locator strategy is pretty similar to accessibility id, except that you don’t get the added benefit of accessibility. As noted above, on iOS this is actually identical to accessibility id. On Android, the id locator strategy is implemented using resource IDs. These are usually added to UI elements manually by app developers, but they are not required, so many app developers omit them if they don’t think they’re important.
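As a rough illustration of the priority order so far, the sketch below picks a locator for an Android element from its page-source attributes: content-desc first (accessibility id), then resource-id (id), and only then an XPath built from class and text. The class name LocatorChooser and its choose method are hypothetical helpers invented for this example, not part of Appium, and the element attributes are made up.

```java
import java.util.Map;

class LocatorChooser {
    // Returns a {strategy, selector} pair following the priority list above.
    // The attribute names mirror those in a UiAutomator2 page source.
    static String[] choose(Map<String, String> attrs) {
        String description = attrs.getOrDefault("content-desc", "");
        if (!description.isEmpty()) {
            return new String[] {"accessibility id", description};
        }
        String resourceId = attrs.getOrDefault("resource-id", "");
        if (!resourceId.isEmpty()) {
            return new String[] {"id", resourceId};
        }
        // Last resort in this sketch: an XPath on class and visible text.
        // Text containing a single quote would need escaping, omitted here.
        String xpath = "//" + attrs.get("class") + "[@text='" + attrs.get("text") + "']";
        return new String[] {"xpath", xpath};
    }

    public static void main(String[] args) {
        // Hypothetical element: it has a resource ID but no content description.
        String[] locator = choose(Map.of(
                "class", "android.widget.Button",
                "text", "Log in",
                "resource-id", "com.example:id/login"));
        System.out.println(locator[0] + " -> " + locator[1]);
    }
}
```

In a real test the returned pair would be handed to whichever find call your client library offers for that strategy.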

3. XPath
Now this is contentious. XPath is the most expressive and commonly accepted locator strategy. Despite Appium developers warning against XPath’s low performance for years, it still seems to be the most popularly used locator strategy. This is probably because there are many selections that can’t easily be made any other way. For example, there’s no way to select the parent of an element using the simple id selectors. The benefit of being able to express more complicated queries must outweigh the cost to performance for all but the testers whose apps have such large XML element hierarchies that XPath is completely unusable.

     XPath selectors can be very brittle, but they can be responsibly wielded to great effect. Being intentional and carefully picking selectors rather than taking whatever an inspector provides can mitigate the brittleness.

    I think part of the popularity of XPath stems from its use with Selenium and web development, as well as from it being the default in many tutorials and inspection tools. When working on Appium I always expected our XPath handling to break more often, but I remember few bugs, probably thanks to open source XPath libraries built for more generalized use.

    The Android OS provides a useful dumpWindowHierarchy method, which gives us an XML document of all the elements on the screen. From there we apply the XPath query and find elements.
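To make the "XML document plus XPath query" idea concrete, here is a minimal sketch that uses only the JDK's javax.xml.xpath package. The hierarchy string is a made-up, heavily simplified stand-in for a real dump, and the class name HierarchyXPathDemo is hypothetical. The query selects the parent of a text element, the kind of selection the simple id strategies cannot express.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class HierarchyXPathDemo {
    // Hypothetical, simplified hierarchy; a real dump carries many more attributes.
    static final String SOURCE =
            "<hierarchy>"
            + "<android.widget.LinearLayout resource-id=\"com.example:id/row\">"
            + "<android.widget.TextView text=\"Settings\"/>"
            + "<android.widget.ImageView content-desc=\"chevron\"/>"
            + "</android.widget.LinearLayout>"
            + "</hierarchy>";

    // Returns the first element matching the query, or null when nothing matches.
    static Element findFirst(String xml, String query) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return (Element) xpath.evaluate(
                    query, new InputSource(new StringReader(xml)), XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath or XML: " + query, e);
        }
    }

    public static void main(String[] args) {
        // "/.." steps from the matched TextView up to its parent row.
        Element row = findFirst(SOURCE, "//android.widget.TextView[@text='Settings']/..");
        System.out.println(row.getTagName() + " " + row.getAttribute("resource-id"));
    }
}
```

The cost of this expressiveness is that the whole document has to be built before the query can run, which is where the performance warnings come from.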

    iOS does not supply a method of getting the entire hierarchy. Appium’s implementation starts at the root application element, recurses through each element’s children, and populates an XML document to which we can then apply the XPath query. I still think XPath is unintuitive, especially for those new to programming, but at least it’s a well-accepted industry standard.

4. -android uiautomator, -ios predicate string or -ios class chain
These are the “native” locator strategies because they are provided by Appium as a means of creating selectors in the native automation frameworks supported by the device. These locator strategies have many fans, who love the fine-grained expression and great performance (equal to, or only slightly slower than, accessibility id or id).

    These locator strategies are crucial for those who have UIs which escape the grasp of the other locator strategies, or who have an element tree which is too large to allow the use of XPath. In my view, they have several drawbacks. These native locator strategies require a more detailed understanding of the underlying automation frameworks. UiAutomator and XCUITest can be hard to use, especially for those less familiar with Android and iOS specifics. These locator strategies are not cross-platform, and knowing the ins and outs of both iOS and Android is challenging.

   In addition, the selectors passed to these native locator strategies are not directly evaluated by the mobile OS. Java, Kotlin, Objective-C and Swift all lack an eval function which would allow interpreting a string of text as code. When you send an -android uiautomator selector to Appium, the text passes through a very simplistic parser, which uses reflection to reconstruct the objects referenced in the text. Because of this, small mistakes in syntax can throw off the entire selector, and only the most common methods are supported. This system is unreliable and often encounters difficult bugs.
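The following toy sketch shows the general idea of parsing a selector string and rebuilding the calls through reflection, and why small syntax slips or unsupported methods fail. It is an assumption-laden illustration only: ToySelector is a stand-in class invented here (not the real UiSelector), and this parser is not Appium's actual implementation.

```java
import java.lang.reflect.Method;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReflectiveSelectorDemo {
    // Stand-in for a framework selector class; this is NOT the real UiSelector.
    public static class ToySelector {
        private final StringBuilder criteria = new StringBuilder();
        public ToySelector text(String value) { criteria.append("text=").append(value).append(';'); return this; }
        public ToySelector clickable(boolean value) { criteria.append("clickable=").append(value).append(';'); return this; }
        @Override public String toString() { return criteria.toString(); }
    }

    private static final String PREFIX = "new ToySelector()";
    // One chained call: .name("string") or .name(true|false). Nothing else is understood.
    private static final Pattern CALL = Pattern.compile("\\.(\\w+)\\((\"[^\"]*\"|true|false)\\)");

    static ToySelector parse(String source) {
        if (!source.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Selector must start with " + PREFIX);
        }
        ToySelector selector = new ToySelector();
        Matcher call = CALL.matcher(source);
        int position = PREFIX.length();
        while (position < source.length()) {
            call.region(position, source.length());
            if (!call.lookingAt()) {
                throw new IllegalArgumentException("Cannot parse selector at offset " + position);
            }
            String raw = call.group(2);
            boolean isString = raw.startsWith("\"");
            Object argument = isString ? raw.substring(1, raw.length() - 1) : Boolean.valueOf(raw);
            Class<?> type = isString ? String.class : boolean.class;
            try {
                // Look the method up by name at runtime instead of compiling the text.
                Method method = ToySelector.class.getMethod(call.group(1), type);
                method.invoke(selector, argument);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Unsupported method: " + call.group(1), e);
            }
            position = call.end();
        }
        return selector;
    }

    // Convenience wrapper: true when the selector text can be reconstructed.
    static boolean parses(String source) {
        try {
            parse(source);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("new ToySelector().text(\"Login\").clickable(true)"));
        System.out.println(parses("new ToySelector().text('Login')"));   // wrong quote style
        System.out.println(parses("new ToySelector().enabled(true)"));   // method not on the stand-in
    }
}
```

A compiler would report both failing inputs at build time; a string-based parser can only reject them when the test runs.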

    If Android or iOS changes the testing classes in a new update, your selectors might need to be updated. With XPath, Appium keeps track of the OS updates and your selectors should keep working.

    A personal quibble I have with these locator strategies is that you are essentially writing a different programming language (such as Java) inside of a string in your test code. Your text editor will not offer syntax highlighting or semantic analysis inside of these queries which makes them harder to maintain. On the other hand, sometimes there’s just no other way, and for those who are proficient in these methods XPath can seem clumsy in comparison.

5. -image
This locator strategy is pretty nifty. Supported on all platforms and drivers, it lets you pass an image file to Appium, which will try to locate the matching elements on the screen. It supports fuzzy matching and certainly has its applications, but you should only use it when the other locator strategies aren’t working. It may not always behave deterministically, which means it can be a source of test flakiness. Also, unless you have a special code editor, image selectors may be a little more difficult to work with and maintain. Of course, visual UI changes will always break -image selectors, whereas the other selectors only break when the element hierarchy changes.

Reference: Appium Pro


Speed Up Android Appium Test Automation


      Appium offers three special capabilities for speeding up Android test initialization (available in the latest Appium version). In addition, using the appPackage and appActivity desired capabilities instead of the app capability helps speed up Android test automation.

  1. skipDeviceInitialization
  2. skipServerInstallation
  3. ignoreUnimportantViews

     skipDeviceInitialization is available for all Android platforms. Passing this desired capability with the boolean value true skips installing the io.appium.settings app on the device. This special app is installed by the Appium Android driver at the beginning of each test and is used to manage specific settings on the device, such as:

  • Wifi/data settings
  • Disabling animation
  • Setting a locale
  • IME settings

Without the io.appium.settings app, the Appium Android driver cannot automate these functions, but if the device previously had this app installed by Appium, it doesn’t need to install it again. If you know that your device is already in the proper state then you can set skipDeviceInitialization to true and skip the time it takes to reinstall it. Appium already checks if the settings app is already installed on the device, but with this capability enabled it even skips the check to see if the app is installed.

     The skipServerInstallation desired capability only applies when using the UiAutomator2 automation method. The UiAutomator2 driver works by installing a special server onto the device, which listens for test commands from Appium and executes them. By default, Appium installs this server onto the device at the beginning of every test session. If skipServerInstallation is set to true, you skip the time it takes to install this server. Of course, without the server on the device Appium can’t automate it, but if you know that the server was installed during a previous test run you can safely skip this step.

   The ignoreUnimportantViews desired capability is not new, but it deserves a mention as another way to potentially speed up Android automation tests, especially if your tests find many elements using XPath locators. Set it to true to speed up Appium’s ability to find elements in Android apps.

     Another major time-saver when it comes to Android tests is using the appPackage and appActivity desired capabilities instead of the app capability. We need to tell Appium which app to test. Usually we use the app capability, which can be a path to a .apk, .apks, or .zip file stored on your computer’s filesystem or stored on a public website. If the app under test is known to already be installed on the device (most likely from a previous test run), the app package name and main activity can be passed instead (using appPackage and appActivity desired capabilities). Skipping the time to download a large file or install it on the Android operating system leads to big savings.

Below is sample code showing how to use these capabilities in your automation script:

import io.appium.java_client.AppiumDriver;
import java.net.URL;
import org.openqa.selenium.remote.DesiredCapabilities;

DesiredCapabilities caps = new DesiredCapabilities();
caps.setCapability("platformName", "Android");
caps.setCapability("deviceName", "Android Emulator");
caps.setCapability("automationName", "UiAutomator2");
// App is already installed on the device, so it can be launched by package name and activity
caps.setCapability("appPackage", "io.cloudgrey.the_app");
caps.setCapability("appActivity", "io.cloudgrey.the_app.MainActivity");
// Skip the installation of the io.appium.settings app and the UiAutomator2 server
caps.setCapability("skipDeviceInitialization", true);
caps.setCapability("skipServerInstallation", true);
caps.setCapability("ignoreUnimportantViews", true);
// driver is assumed to be an AppiumDriver field declared in the test class
driver = new AppiumDriver(new URL("http://localhost:4723/wd/hub"), caps);
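Because the skip capabilities are only safe once a previous run has provisioned the device, one option is to toggle them from a single flag. The sketch below builds the capabilities as a plain map so it runs without any Appium dependency. AndroidCapsSketch, build, and the deviceProvisioned flag are hypothetical names for this example; the package and activity values are the ones used in the sample above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AndroidCapsSketch {
    // deviceProvisioned: true when an earlier session already installed the app,
    // the io.appium.settings app and the UiAutomator2 server on this device.
    static Map<String, Object> build(boolean deviceProvisioned, String apkPath) {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("platformName", "Android");
        caps.put("automationName", "UiAutomator2");
        caps.put("ignoreUnimportantViews", true);
        if (deviceProvisioned) {
            // Fast path: launch the installed app and skip the setup steps.
            caps.put("appPackage", "io.cloudgrey.the_app");
            caps.put("appActivity", "io.cloudgrey.the_app.MainActivity");
            caps.put("skipDeviceInitialization", true);
            caps.put("skipServerInstallation", true);
        } else {
            // First run: let Appium install the app and its helper components.
            caps.put("app", apkPath);
        }
        return caps;
    }

    public static void main(String[] args) {
        System.out.println(build(false, "/path/to/the_app.apk"));
        System.out.println(build(true, "/path/to/the_app.apk"));
    }
}
```

Each entry of the returned map can then be copied into the capabilities object with setCapability, as in the sample above.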

Try the methods above in your Android automation and see the difference in execution speed.

Reference: Appium Pro


Parallel Test Executions in Automation – TestNG and NUnit


      I once had the opportunity to automate an application using both Selenium Java and Selenium C#; implementing it in both languages was a specific requirement of the customer. Later, the customer asked for test cases from different classes to be executed in different browsers in parallel. Their aim was to complete the execution of all test cases in the minimum amount of time. Here, I would like to share how I achieved parallel execution in both Selenium Java and Selenium C#.

Parallel Execution in Selenium Java Automation:

       For Selenium Java, I used TestNG’s parallel execution capability via the attribute parallel="tests" on the <suite> tag in the TestNG XML. I created two <test> tags to launch two different driver instances. Following is the sample testng.xml which I used for parallel execution:

<suite name="TestSuite" parallel="tests" preserve-order="true">
  <test name="Test In Google Chrome">
    <parameter name="browserName" value="Chrome"/>
    <classes>
      <class name="com.test.TestSuiteClassSmoke"/>
    </classes>
  </test>
  <test name="Test In Firefox">
    <parameter name="browserName" value="Firefox"/>
    <classes>
      <class name="com.test.TestSuiteClassSanity"/>
    </classes>
  </test>
</suite>

      The above testng.xml file contains one test that runs all the test cases of the Smoke suite in Google Chrome and a second test that runs all the test cases of the Sanity suite in Firefox. Both browser instances start at the same time and execute in parallel. The TestSuiteClassSanity and TestSuiteClassSmoke classes extend the AutomationBase class to get the driver instances for execution. Following is the relevant part of the AutomationBase class:

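@Parameters("browserName") // assumed annotation: injects the <parameter> value from testng.xml
public void SetUp(String browserName) throws IOException, InterruptedException {
    try {
        if (browserName.equalsIgnoreCase("chrome")) {
            // implement code to start chrome driver session
        } else if (browserName.equalsIgnoreCase("firefox")) {
            // implement code to start firefox driver session
        }
    } catch (Exception e) {
        // handle or report the failure to start the browser
    }
}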

Parallel Execution in Selenium C# Automation:

      In Selenium C#, there is no direct equivalent of the TestNG XML for running test cases in parallel in different browser instances. Instead, I placed NUnit attributes just above every test class, each of which extends the AutomationBase class. I used the following NUnit attributes:

  • [TestFixture(“chrome”)]
  • [TestFixture(“firefox”)]
  • [Parallelizable]

        I added a parameterized constructor (taking browserName as its parameter) that calls the StartBrowser(browserName) method. StartBrowser(browserName) is implemented in the AutomationBase class. Following is a sample implementation of a test class:

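using NUnit.Framework;
namespace com.test.testcases
{
    [TestFixture("chrome")]
    [TestFixture("firefox")]
    [Parallelizable]
    public class TestModuleClass : AutomationBase
    {
        public TestModuleClass(string browserName) { StartBrowser(browserName); }
        [Test, Order(1)]
        public void TC_001_SampleTest()
        {
            // Your test steps
        }
    }
}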

    In the AutomationBase class, I added a new method, StartBrowser(String browserName), which identifies the requested browser and contains the logic to start the Chrome and Firefox drivers. Following is the implementation of StartBrowser:

public IWebDriver StartBrowser(String browserName)
{
    try
    {
        if (browserName.ToLower().Equals(""))
            throw new Exception("BROWSER_NAME is not specified");
        if (browserName.ToLower().Equals("chrome"))
        {
            // implement code to start chrome driver session
        }
        if (browserName.ToLower().Equals("firefox"))
        {
            // implement code to start firefox driver session
        }
    }
    catch (Exception) { throw; } // rethrow, preserving the original stack trace
    return driver; // driver is a field of AutomationBase, set by the branches above
}

     You can try the above mechanisms for parallel execution in both Selenium Java and Selenium C#.


Automating Voice Commands With Siri


     It’s very common with modern mobile devices to rely on virtual “assistants” to get tasks done, whether in a hands-free situation utilizing voice commands, or just to save the trouble of tapping out search queries. On iOS these interactions take place through the Siri interface.

Hey Siri

     How on earth would you test this aspect of your app’s behavior? Ideally you’d be able to have a recording of the particular voice command or phrase used to trigger your app’s Siri integration, which you could then somehow apply to the simulator or device under test. This is not currently possible, outside of rigging up some speakers!

     Fortunately, Appium has added a command (as of Appium 1.10.0) that lets you specify the text you want Siri to parse, as if it had been spoken by a person.

The command itself is accessible via the executeScript “mobile” interface:

HashMap<String, String> args = new HashMap<>();
args.put("text", "Hey Siri, what's happening?");
driver.executeScript("mobile: siriCommand", args);

     Essentially, we construct an options hash with our desired text string, and pass it to the siriCommand “mobile” method. We can run this command at any point in our automation, and it will take care of getting to the Siri prompt for us as well (we don’t need to long-hold the home button).

     At this point we can use the typical native automation methods to verify Siri’s response on the screen, tap on action items, etc…

     That’s basically it! There’s not much to it. So let’s have a look at a full example that asks Siri a math question (What’s two plus two?) and verifies the result (notice how the result text shows up as accessibility IDs, which I found by looking at the page source).

public void testSiriTalk() {
    HashMap<String, String> args = new HashMap<>();
    args.put("text", "What's two plus two?");
    driver.executeScript("mobile: siriCommand", args);
    // The result text is exposed as an accessibility ID in the page source
    wait.until(ExpectedConditions.presenceOfElementLocated(MobileBy.AccessibilityId("2 + 2 =")));
}

     This requires java-client version 6.1.0 and Appium server version 1.10.0. Try the sample test case above to see exactly how automating a voice command with Siri works.

Reference: Appium Pro


AI for Appium Test Automation


     Perhaps the most buzzy of the buzzwords in tech these days is “AI” (Artificial Intelligence), or “AI/ML” (throwing in Machine Learning). To most of us, these phrases seem like magical fairy dust that promises to make the hard parts of our tech jobs go away. To be sure, AI is largely over-hyped, or at least its methods and applications are largely misunderstood and therefore assumed to be much more magical than they are.

         So how can you use AI with Appium? It’s a bit surprising, but the Appium project has developed an AI-powered element finding plugin for use specifically with Appium.

          First, let’s discuss element finding plugins. A recent addition to Appium is the ability for third-party developers to create “plugins” for Appium that can use an Appium driver together with their own unique capabilities to find elements. As we’ll see below, users can access these plugins simply by installing the plugin as an NPM module in their Appium directory, and then using the customFindModules capability to register the plugin with the Appium server.

       The first plugin worked on within this new structure was one that incorporates a machine learning model from Test.ai designed to classify app icons, the training data for which was just open-sourced. This is a model which can tell us, given the input of an icon, what sort of thing the icon represents (for example, a shopping cart button, or a back arrow button). The application we developed with this model was the Appium Classifier Plugin, which conforms to the new element finding plugin format.

      Basically, we can use this plugin to find icons on the screen based on their appearance, rather than knowing anything about the structure of our app or needing to ask developers for internal identifiers to use as selectors. For the time being the plugin is limited to finding elements by their visual appearance, so it really only works for elements which display a single icon. Luckily, these kinds of elements are pretty common in mobile apps.

         This approach is more flexible than existing locator strategies (like accessibility id, or image) in many cases, because the AI model is trained to recognize icons without needing any context, and without requiring them to match only one precise image style. What this means is that using the plugin to find a “cart” icon will work across apps and across platforms, without needing to worry about minor differences.

         So let’s take a look at a concrete example, demonstrating the simplest possible use case. If you fire up an iOS simulator you have access to the Photos application, which looks something like this:

The Photos app with search icon

        Notice the little magnifying glass icon near the top which, when clicked, opens up a search bar:

The Photos app with search bar and cancel text

             Let’s write a test that uses the new plugin to find and click that icon. First, we need to follow the setup instructions to make sure everything will work. Then, we can set up our Desired Capabilities for running a test against the Photos app:

DesiredCapabilities caps = new DesiredCapabilities();
caps.setCapability("platformName", "iOS");
caps.setCapability("platformVersion", "11.4");
caps.setCapability("deviceName", "iPhone 6");
caps.setCapability("bundleId", "com.apple.mobileslideshow");

Now we need to add some new capabilities: customFindModules (to tell Appium about the AI plugin we want to use), and shouldUseCompactResponses (because the plugin itself told us we need to set this capability in its setup instructions):

HashMap<String, String> customFindModules = new HashMap<>();
customFindModules.put("ai", "test-ai-classifier");
caps.setCapability("customFindModules", customFindModules);
caps.setCapability("shouldUseCompactResponses", false);

         You can see that customFindModules is a capability which has some internal structure: in this case “ai” is the shortcut name for the plugin that we can use internally in our test, and “test-ai-classifier” is the fully-qualified reference that Appium will need to be able to find and require the plugin when we request elements with it.

Once we’ve done all this, finding the element is super simple:


           Here we’re using a new custom locator strategy so that Appium knows we want a plugin, not one of its supported locator strategies. Then, we’re prefixing our selector with ai: to let Appium know which plugin specifically we want to use for this request (because there could be multiple). Of course since we are in fact only using one plugin for this test, we could do away with the prefix (and for good measure we could use the different find command style, too):
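To illustrate the prefix behaviour just described, here is a small model of how a selector such as "ai:search" could be split into the registered plugin and the label handed to it, and how the prefix can be dropped when only one plugin is registered. This is a sketch of the idea, not Appium's actual code: CustomFindSelectorDemo and resolve are hypothetical names, and "search" is an assumed label for the magnifying glass icon.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CustomFindSelectorDemo {
    // Returns {moduleReference, selectorForPlugin} for a custom-strategy selector.
    static String[] resolve(Map<String, String> customFindModules, String selector) {
        int colon = selector.indexOf(':');
        if (colon > 0) {
            String shortcut = selector.substring(0, colon);
            String module = customFindModules.get(shortcut);
            if (module != null) {
                // The prefix names a registered plugin; the rest is the plugin's selector.
                return new String[] {module, selector.substring(colon + 1)};
            }
        }
        if (customFindModules.size() == 1) {
            // With a single registered plugin the prefix is unnecessary.
            String onlyModule = customFindModules.values().iterator().next();
            return new String[] {onlyModule, selector};
        }
        throw new IllegalArgumentException("Selector needs a known plugin prefix: " + selector);
    }

    public static void main(String[] args) {
        Map<String, String> modules = new LinkedHashMap<>();
        modules.put("ai", "test-ai-classifier");
        String[] prefixed = resolve(modules, "ai:search");
        String[] bare = resolve(modules, "search");
        System.out.println(prefixed[0] + " <- " + prefixed[1]);
        System.out.println(bare[0] + " <- " + bare[1]);
    }
}
```

Both calls end up at the same plugin with the same label, which is why the prefix is optional in a single-plugin test.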


           And that’s it! As mentioned above, this technology has some significant limitations at the current time, for example that it can really only reliably find elements which are one of the icons that the model has been trained to detect. On top of that, the process is fairly slow, both in the plugin code (since it has to retrieve every element on screen in order to send information into the model), and in the model itself. All of these areas will see improvement in the future, however. And even if this particular plugin isn’t useful for your day-to-day, it demonstrates that concrete applications of AI in the testing space are not only possible, but actual!

Try implementing the AI plugin described above in your own automation script.


Reference: Appium Pro