In this blog post, I give a brief introduction to writing unit tests for your Bigtable application in Java.
Writing unit tests with Cloud Bigtable Emulator
The main building block for unit testing Bigtable applications is the Google Cloud Bigtable emulator, which simulates a Bigtable cluster in memory. You may already know that you can start this emulator locally using the gcloud command-line tool. There is also a Java wrapper that runs the emulator programmatically, which is especially suitable for unit testing.
To start using the emulator wrapper, you have to specify the following dependencies in your pom.xml if you are using Maven (you can look up the latest versions or use google-cloud-bom for dependency management):
```xml
<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-bigtable</artifactId>
  <version>1.22.0</version>
</dependency>

<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-bigtable-emulator</artifactId>
  <version>0.131.0</version>
  <scope>test</scope>
</dependency>

<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>4.12</version>
  <scope>test</scope>
</dependency>
```
The first thing you need in your unit test class is a BigtableEmulatorRule. This JUnit rule ensures that an emulator instance is started before each test method and stopped afterwards. Since starting an instance is very fast, it does not significantly affect the overall runtime of the tests.
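```java
import com.google.cloud.bigtable.emulator.v2.BigtableEmulatorRule;
import org.junit.Rule;

public class BigtableTests {

    // Starts an emulator before each test method and stops it afterwards
    @Rule
    public final BigtableEmulatorRule bigtableEmulator = BigtableEmulatorRule.create();

    …
}
```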
Now you can create the tables used in your application code. It is useful to do this in a dedicated setup method that is executed before each test method. The key step in connecting a BigtableTableAdminClient to the emulator instance is the call to newBuilderForEmulator(), which takes the emulator's port number. You can query the port currently in use from the Bigtable emulator rule. You also have to define project and instance IDs, but arbitrary values work here. With the initialized table admin client, you can create your tables.
The second part of the setup method creates a Bigtable data client, which can be used in your test methods. The steps to connect the client with the emulator instance are equivalent to the ones mentioned before.
```java
...

private BigtableDataClient dataClient;

@Before
public void setUp() throws IOException {
    final String projectId = "TestProject";
    final String instanceId = "TestInstance";

    // Create new tables
    BigtableTableAdminSettings.Builder tableAdminSettings =
        BigtableTableAdminSettings.newBuilderForEmulator(bigtableEmulator.getPort())
            .setProjectId(projectId).setInstanceId(instanceId);

    BigtableTableAdminClient tableAdminClient =
        BigtableTableAdminClient.create(tableAdminSettings.build());

    tableAdminClient.createTable(
        CreateTableRequest.of("example-table")
            .addFamily("cf", GCRules.GCRULES.maxVersions(1))
    );

    // Create Bigtable data client
    BigtableDataSettings.Builder dataSettings =
        BigtableDataSettings.newBuilderForEmulator(bigtableEmulator.getPort())
            .setProjectId(projectId).setInstanceId(instanceId);

    dataClient = BigtableDataClient.create(dataSettings.build());
}
```
Now everything is set up to run queries against the emulated tables. An example test method may look like this:
```java
@Test
public void testMutation() {
    dataClient.mutateRow(
        RowMutation.create("example-table", "example-key")
            .setCell("cf", "col", "foo")
    );

    Row row = dataClient.readRow("example-table", "example-key");

    assertEquals("foo", row.getCells().get(0).getValue().toStringUtf8());
}
```
Connecting application code to Bigtable emulator
As mentioned above, the connection between a Bigtable data client and the emulator is established by specifying the currently used port. But how is it possible to test Bigtable calls in your application? The port is not known beforehand and you do not want to implement special conditions in your application code just for unit testing.
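To make the "port is not known beforehand" point concrete, here is a minimal sketch using only the JDK. It assumes the emulator's port is assigned dynamically at startup, much like an ephemeral port handed out by the operating system. FreePortDemo and its methods are hypothetical names for illustration and are not part of the emulator wrapper.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Illustration only: FreePortDemo is a hypothetical helper, not part of the emulator wrapper.
final class FreePortDemo {

    // Asks the operating system for a currently unused TCP port (port 0 means "pick one").
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a host:port string of the kind a client needs in order to connect.
    static String emulatorHost(int port) {
        return "localhost:" + port;
    }

    public static void main(String[] args) {
        // The printed port usually differs from run to run.
        System.out.println(emulatorHost(findFreePort()));
    }
}
```

Because the value is only known at runtime, it cannot be hard-coded into the application's configuration; it has to be handed to the client when the test starts.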
Fortunately, there is an undocumented feature built into the Bigtable data client, or more precisely into BigtableDataSettings, that solves this problem. If you look into the code, you will see special handling for the case where an environment variable called BIGTABLE_EMULATOR_HOST is set. In this case, an ordinary call to newBuilder() will actually execute newBuilderForEmulator(), as we did manually before.
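The idea behind this switch can be pictured with a small, library-free sketch. EndpointResolver, its resolve method and the default endpoint constant are hypothetical and chosen for illustration only; the real logic lives inside the client's settings classes and may differ in detail.

```java
import java.util.Map;

// Hypothetical stand-in for a settings factory; this is not the real BigtableDataSettings code.
final class EndpointResolver {

    static final String EMULATOR_ENV_VAR = "BIGTABLE_EMULATOR_HOST";

    // Assumed placeholder for the production endpoint.
    static final String DEFAULT_ENDPOINT = "bigtable.googleapis.com:443";

    // Returns the emulator address if the variable is set, otherwise the default endpoint.
    static String resolve(Map<String, String> env) {
        String emulatorHost = env.get(EMULATOR_ENV_VAR);
        if (emulatorHost != null && !emulatorHost.isEmpty()) {
            return emulatorHost;
        }
        return DEFAULT_ENDPOINT;
    }

    public static void main(String[] args) {
        // Resolution against the real process environment
        System.out.println(resolve(System.getenv()));
        // Resolution against an environment that points to a local emulator
        System.out.println(resolve(Map.of(EMULATOR_ENV_VAR, "localhost:8086")));
    }
}
```

The application code keeps calling the same factory method in production and in tests; only the environment decides which endpoint is used.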
Sadly, dynamically setting an environment variable in Java is not a trivial task. For this, I used the library System Rules by Stefan Birkner (alternatively you can also use System Lambda).
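A short JDK-only sketch shows why this is not trivial: the map returned by System.getenv() is documented as unmodifiable, so a plain put is rejected. EnvIsReadOnly is a hypothetical class name used for illustration.

```java
import java.util.Map;

// Illustration only: shows that the process environment cannot be changed through System.getenv().
final class EnvIsReadOnly {

    // Tries to add a variable to the environment map and reports whether that worked.
    static boolean canModifyEnvironment() {
        Map<String, String> env = System.getenv();
        try {
            env.put("BIGTABLE_EMULATOR_HOST", "localhost:8086");
            return true;
        } catch (UnsupportedOperationException e) {
            // The environment map is an unmodifiable view.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Environment writable: " + canModifyEnvironment());
    }
}
```

Libraries such as System Rules and System Lambda exist to work around this restriction for the duration of a test.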
With System Rules, we can rewrite our setup code as follows:
```java
@Rule
public final BigtableEmulatorRule bigtableEmulator = BigtableEmulatorRule.create();

@Rule
public final EnvironmentVariables environmentVariables = new EnvironmentVariables();

private BigtableDataClient dataClient;

@Before
public void setUp() throws IOException {
    final String projectId = "TestProject";
    final String instanceId = "TestInstance";

    environmentVariables.set("BIGTABLE_EMULATOR_HOST", "localhost:" + bigtableEmulator.getPort());

    // Create new tables
    BigtableTableAdminSettings.Builder tableAdminSettings =
        BigtableTableAdminSettings.newBuilder()
            .setProjectId(projectId).setInstanceId(instanceId);

    BigtableTableAdminClient tableAdminClient =
        BigtableTableAdminClient.create(tableAdminSettings.build());

    tableAdminClient.createTable(
        CreateTableRequest.of("example-table")
            .addFamily("cf", GCRules.GCRULES.maxVersions(1))
    );

    // Create Bigtable client
    BigtableDataSettings.Builder dataSettings =
        BigtableDataSettings.newBuilder()
            .setProjectId(projectId).setInstanceId(instanceId);

    dataClient = BigtableDataClient.create(dataSettings.build());
}
```
The difference is that now we do not need to call newBuilderForEmulator() explicitly. We just set the magic environment variable and the clients will automatically connect to the Bigtable emulator. This is very handy, isn’t it?
HBase Mini Cluster vs. Cloud Bigtable Emulator
If you are using the HBase API to communicate with Bigtable, perhaps because you migrated from an HBase cluster to Google Cloud, you may have already written unit tests using the HBase testing utility. This tool simulates an HBase/Hadoop cluster in memory and is well suited to writing unit tests for HBase.
But as soon as you plan to run your code against Cloud Bigtable, I recommend discarding the HBase testing utility in favour of the Bigtable emulator. The main reason is that Bigtable's implementation is actually very different from that of HBase. Therefore, your application may behave unexpectedly when running against Bigtable, although it worked properly on HBase. With the Bigtable emulator, your tests can detect these differences and thus help you avoid errors at runtime. Also, the Bigtable emulator is much faster than the HBase testing utility, since it does not have to launch a whole Hadoop cluster. So it is worth switching to the emulator.