Monday, July 21, 2014

Interact with Windows modal dialog from your Selenium scripts

While testing web applications, we sometimes encounter Windows popup dialogs (modal windows) that block operation until the dialog is cleared. Below is how I solved this issue. Please note that there are other options out there (e.g. AutoIt); however, I decided against calling an external exe and opted for integrating the tried and true Win32::GuiTest module for this task. The main reason is that it is written in the same language as the bindings, so there is no need to call external executables or to learn a new "language": you simply call a method.

NOTE: The workhorse of this solution is the SendKeys call.

In your test script:
    my $win_id = 0;
    my $win_title = 'Test window title';
    my $win_class = 'Class name';
    $driver->key_press_native( $win_id, $win_title, $win_class, 'ENTER' );

In the OR in your page object:
    sub key_press_native {
      my ( $self, $win_id, $win_title, $win_class, $keycode ) = @_;
      use Win32::GuiTest qw(FindWindowLike GetWindowText SetForegroundWindow SendKeys);
      $Win32::GuiTest::debug = 0; # Set to "1" to enable verbose mode
      # First find the window of interest.
      my @windows = FindWindowLike( $win_id, "^$win_title", "^$win_class\$" );
      # Then we iterate through that list and send the "keys" to any matching window.
      for (@windows) {
        SetForegroundWindow($_);
        SendKeys("{$keycode}");
        Custom::TagSubs::wait_for(2);
      }
      return;
    }
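
One detail worth illustrating: FindWindowLike treats the title and class arguments as regular expressions, so a title containing characters such as parentheses or dots may not match the way you expect. The following is only a sketch of one way to handle that, assuming you want literal matching; window_patterns and sendkeys_token are hypothetical helper names, not part of Win32::GuiTest, and the sample title is made up.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Win32::GuiTest): turn a literal title and
# class name into the two regex strings that FindWindowLike expects.
sub window_patterns {
    my ( $win_title, $win_class ) = @_;
    my $title_re = '^' . quotemeta($win_title);          # title starts with the text
    my $class_re = '^' . quotemeta($win_class) . '$';    # class matches exactly
    return ( $title_re, $class_re );
}

# Hypothetical helper: wrap a named key such as ENTER in braces for SendKeys.
sub sendkeys_token {
    my ($keycode) = @_;
    return '{' . $keycode . '}';
}

# Example values only; '#32770' is the standard Windows dialog class.
my ( $title_re, $class_re ) = window_patterns( 'Save As (1.0)', '#32770' );
print "$title_re\n";
print "$class_re\n";
print sendkeys_token('ENTER'), "\n";
```

The two returned strings could then be passed as the second and third arguments of FindWindowLike in place of the interpolated strings used above.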

As you can see from the above, the solution requires the test developer to know the pertinent details of the window to interact with (i.e. window id, window title, window class) as well as the key(s) to press (i.e. keycode). Luckily, Win32::GuiTest comes with a tool, found in Recorder\Win32GuiTest.exe, which you can use to get "Window Hints" (i.e. WinClass, WinTitle, etc).
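
If you would rather collect those details from a script, something like the sketch below may help. It assumes Win32::GuiTest is installed on a Windows machine; describe_window is a hypothetical helper name used only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: format one window's details as a single report line.
sub describe_window {
    my ( $hwnd, $class, $title ) = @_;
    return sprintf 'id=%d class=%s title=%s', $hwnd, $class, $title;
}

# Win32::GuiTest only exists on Windows, so load it at run time there.
if ( $^O eq 'MSWin32' ) {
    require Win32::GuiTest;

    # 0 starts the search at the desktop; the regex '.' keeps only
    # windows that have a non-empty title.
    for my $hwnd ( Win32::GuiTest::FindWindowLike( 0, '.' ) ) {
        print describe_window(
            $hwnd,
            Win32::GuiTest::GetClassName($hwnd),
            Win32::GuiTest::GetWindowText($hwnd),
        ), "\n";
    }
}
```

Run it while the dialog is open and look for the line whose title matches the popup.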
Some caveats:
  • The solution uses the SetForegroundWindow call, which means the window being interacted with must be able to be brought to the foreground.
Information on Win32::GuiTest can be found here: you can also find key codes in the module documentation.
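
Regarding the SetForegroundWindow caveat above, one possible safeguard is to confirm that the window really came to the front before typing into it. This is a sketch only: send_if_foreground and the $api hash of code references are hypothetical, introduced so the logic can be tried without a desktop session; on Windows they would point at the Win32::GuiTest functions of the same name.

```perl
use strict;
use warnings;

# Hypothetical wrapper: $api is a hash of code refs (an assumption made so
# the logic can be exercised without a real desktop).
sub send_if_foreground {
    my ( $api, $hwnd, $keys ) = @_;
    $api->{set_foreground}->($hwnd);

    # Only type when the requested window really is in front.
    return 0 unless $api->{get_foreground}->() == $hwnd;

    $api->{send_keys}->($keys);
    return 1;
}

# On Windows the code refs would point at the real Win32::GuiTest functions.
my $api;
if ( $^O eq 'MSWin32' ) {
    require Win32::GuiTest;
    $api = {
        set_foreground => \&Win32::GuiTest::SetForegroundWindow,
        get_foreground => \&Win32::GuiTest::GetForegroundWindow,
        send_keys      => \&Win32::GuiTest::SendKeys,
    };
}
```

Inside the loop in key_press_native, a call such as send_if_foreground( $api, $_, "{$keycode}" ) would then skip windows that could not be activated instead of sending keys to whatever window happens to have focus.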

Feel free to drop a line or comment.


  1. Good post. It's unfortunate this solution doesn't work under Selenium Grid or RemoteWebDriver case where target browser is not on local machine executing the Perl code.

    1. Greetings,
      Thanks! This solution should work if you are using the selenium-server-standalone jar file; folks using the Perl bindings from CPAN are required to use the jar file.

      Why wouldn't you be able to use the solution in a Grid configuration?

    2. Think about it a bit. Where does the Perl code execute from? What do the Perl Selenium bindings do when talking to a RemoteWebDriver? When you execute Perl's Win32::GuiTest module, what machine is it run against or targeting? In the simple case where you run RemoteWebDriver on the same localhost that is executing the Perl code, everything is local. What happens when RemoteWebDriver isn't local anymore?

      I did a blog post about this a while back that hopefully clarifies things:

    3. Hi, I'll try to answer your questions (presuming they weren't rhetorical) 😊

      Where does the Perl code execute from? - On the node executing the test. Whether that node is local or on another host, the code will be present on that node.

      When you execute Win32::GuiTest, what machine is it run against or targeting? - The node that was selected to run the test (local or remote).

      Of course you have to make sure everything is installed on all your nodes but this is part of setting up the environment.

      It would be just like running AutoIt on a remote machine, only that you're executing a Perl script instead of an exe.

      I'm curious if you have tried this configuration. I have and it works. I'll check out your link now and report back.

    4. Well, when you do it that way, it would work because it is always "local" to the node. The problem I point out is that most people who run into the problem are running the test from a central test agent machine/node A that kicks off the test. In a grid deployment like this, the test code (Perl, for example) runs on test machine A. The test code invokes Selenium driver instances that talk to Grid, and the browsers are on remote node machines B, C, D, etc. Because Selenium Grid handles the remote communications for you, there's nothing to worry about there. But the Selenium test code is intermixed with, say, AutoIt or Win32::GuiTest code, and guess what, the user forgot that this code executes locally on machine A (there's no "grid" to handle this remotely). So when they run the test, it fails at the AutoIt/Win32 GUI part.

      Now I'm curious how you set up your code to run specifically on each node and not a central machine that kicks off test jobs. I guess that's easier if you use Jenkins agents to fork out jobs to each node or similar but if you use straight Selenium Grid method, that doesn't work out so easily if you don't plan it right.

      So I think I brought up a good point. I don't think your article mentioned that caveat that if you wanted to execute this in Grid/remote fashion, you have to "set up" your environment correctly.

      Also, another note/question, why Perl for the automation? Except for some good reason, from the standpoint of WebDriver alone, if you used Java or any of the official bindings, you wouldn't need to use the server JAR with RemoteWebDriver and can simply remove that dependency and just have the language binding only for invoking Selenium. I say this because you execute the test code locally on the node it should run on rather than via Selenium Grid, etc. so there really isn't a need for RemoteWebDriver (except that Perl requires it as an unofficial binding).

    5. Ah I see. Well yes if the test guy / gal is using the Java bindings or any other that does not require the jar file then this solution will not work. This solution is specifically for the Perl bindings which do require the jar file. I think this is what you are saying also though :)

      When you set up a Grid environment and you are using any of the "unofficial" bindings, you basically launch the jar file in the desired "mode" of operation (hub or node). But the jar file will have to be copied onto all the nodes and the hub.

      And yes, thanks for helping me clarify that this is for the Perl bindings only and not for the Java, C# or other official bindings. Although you can also run the jar file with those bindings I reckon. Can't you?

      Why Perl? I like Perl because, well, it's the Swiss army knife of automation, system administration, etc. Not to mention the large library of modules (such as Win32::GuiTest) that already exist, which really helps me not re-invent the wheel in most situations :) Another reason is that I have been programming in Perl for a while and so know a few tricks. Yet another reason is that (just like in Java) I write one cross-platform script and I can run it on all OS / platform combinations.

      This system I designed (linked), while it doesn't use Win32::GuiTest, depicts how I launch grid in my environments:

      Also, in the past I have bypassed using nodes or hubs altogether and just used a MySQL database plus Perl's fork() in order to run scripts not only concurrently on the same machine, but also on remote machines (running the selenium-remote-server, of course). Perl is very powerful (just like Java and C# are). One of the best advantages I see with Perl is that it's tried and true (i.e. it's been around for a long time), longer than Java or any of the other modern languages of choice of today's automation folks.

      Great chatting with you. Where do you work? If you don't mind me asking.

    6. It was good chatting with you too Freddy. I work as QA in the USA in Silicon Valley. I work for an internet company that deals with photos, and used to work in telecom field, leave the rest a mystery to solve :)

      My point of this discussion was that your article as I read it leaves out the caveat to the reader that in order to use this in a non-local deployment (Grid or just RemoteWebDriver running not on localhost) then they must run this same Perl code on that remote host, or if they prefer to run it still on their localhost (or the designated central test machine) then your given code has to be adapted such that it needs to make a remote call (with whatever tool/implementation) to Win32::GuiTest, not the presently given local execution example here. Because Win32::GuiTest offers no option to specify which host machine you want to target unlike RemoteWebDriver. A novice user may not know that and get confused when trying to deploy this on Grid "as is" and find it doesn't work. A good example follow up to this article could be an elaboration to the reader how you would deploy this across multiple hosts to execute. Though in your last comment, the system you designed may be enough for the reader to understand I hope (I haven't looked at it yet).

      And yes, you can use the JAR for RemoteWebDriver with all the official bindings. My earlier comment about that and Perl was that, interpreting your code "as is", it appears to be "local" execution (without further context of how you execute it remotely), in which case using the official bindings removes the JAR dependency for local execution. In remote/Grid execution, that doesn't matter of course.

      I used to use Perl for work, it is a good language. Still use it when it comes in handy. Seeing its usefulness in testing, I created a remote server implementation for Robot Framework (RF), so one can use Perl better with it: If you ever deal with RF, feel free to improve my Perl remote server. There are a few outstanding issue tickets to do that I haven't gotten around to.

    7. Ha! I love a mystery... :) however, the words photos and silicon valley pretty much solve that one ;)

      I truly appreciate all the feedback. You are correct in that most likely a novice will not be able to implement this solution. I do try to target folks that have been using the Selenium Perl bindings hence I presumed, wrongly so it appears, :) that people know that even with Grid if you're using the Perl bindings you are going to need the server running on all the nodes.

      So to clarify for the novices if you're running in a distributed environment you will need to have not only Perl and the selenium-server-standalone, but also Win32::GuiTest (as well as any other module you use in your scripts) installed on each of the nodes.

      Let's say, for example, you were using AutoIt in a distributed environment. I presume you would also need to have AutoIt installed on all the machines in the environment; is that correct?

      I do remember you talking about the RF on the Perl LinkedIn group. Actually, I was not aware of that framework until you mentioned it... so thanks!
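
To illustrate the caveat discussed in this thread: Win32::GuiTest acts on the machine where the Perl code runs, so a script could check whether the WebDriver server it talks to is on that same machine before sending native keys. The sketch below is an assumption-laden illustration, not part of the original solution: targets_local_machine is a hypothetical helper, it only compares names (it does not resolve DNS), and with a Grid hub the address you connect to is the hub, not the node running the browser, so it only helps when talking to a RemoteWebDriver directly.

```perl
use strict;
use warnings;
use Sys::Hostname qw(hostname);

# Hypothetical helper: true when the address given as remote_server_addr
# names the machine this Perl script is running on.
sub targets_local_machine {
    my ($remote_server_addr) = @_;
    my %local = map { lc($_) => 1 } ( 'localhost', '127.0.0.1', '::1', hostname() );
    return $local{ lc $remote_server_addr } ? 1 : 0;
}

# Example values only; the remote host name is made up.
for my $addr ( 'localhost', 'node-b.example.test' ) {
    print "$addr: ",
        ( targets_local_machine($addr) ? 'native keys can be sent from here' : 'run the Perl code on that host instead' ),
        "\n";
}
```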

  2. FYI, one can also use AutoIt (more) natively with Perl via COM. It thus requires COM from Perl, but from there it's just writing Perl code calling AutoIt's (COM) API, similar to calling the Win32::GuiTest module's methods. Granted, I'd agree it is better to use Win32::GuiTest. However, for those more familiar with AutoIt, or who use AutoIt code across languages, the COM option is nice to consider since it gives you the same API. I actually prefer AutoIt via COM to AutoIt's own scripting language.

    1. Thanks for info regarding the API. I'll check it out.
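
For readers curious about the COM route mentioned in this thread, a rough sketch might look like the following. It assumes AutoItX is installed and its COM control is registered on the Windows machine; press_key_via_autoit is a hypothetical helper name, and the window title is the same placeholder used in the post.

```perl
use strict;
use warnings;

# Hypothetical helper: $autoit is an AutoItX COM object (or anything with
# the same WinWait / WinActivate / Send methods).
sub press_key_via_autoit {
    my ( $autoit, $win_title, $keys, $timeout ) = @_;

    # WinWait returns 0 if the window does not appear within $timeout seconds.
    return 0 unless $autoit->WinWait( $win_title, '', $timeout );

    $autoit->WinActivate($win_title);
    $autoit->Send($keys);
    return 1;
}

if ( $^O eq 'MSWin32' ) {
    require Win32::OLE;

    # Assumes AutoItX is installed and its COM control is registered.
    my $autoit = Win32::OLE->new('AutoItX3.Control')
        or die 'Could not create the AutoItX COM object';
    press_key_via_autoit( $autoit, 'Test window title', '{ENTER}', 5 );
}
```

The same locality caveat applies here: the COM object drives windows on the machine where the Perl script runs.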


Creative Commons License
VGP-Miami Web and Mobile Automation Blog by Alfred Vega is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.