The Excel Scripting (COM) Interface

In our page on talking to Excel from R we describe how to do some common Excel operations using the Excel API. This is the internal Excel C API. Most people who use Excel are more familiar with the Excel scripting (COM) API – this is the one that you use in VBA, for example.

Using the C API is faster, but there’s a lot of stuff that you can’t do with it. The COM API is slower, but more flexible. Also it’s much better documented, if you like that sort of thing.

The Excel Application Object

The main Application object is installed as EXCEL$Application. The dollar sign indicates that the object is called Application, and it’s in an environment called EXCEL. Most of the things you do with Excel will start with this object. To get a quick look at the methods, try

> ls( EXCEL$Application )

Objects and Environments

Everything that’s an object for purposes of COM is set in R as an environment (with some S3 class names added for identification). That’s why all the syntax uses dollar signs (indexing into environments). To get a range, for example, you could say

> rng <- EXCEL$Application$get_Range( "A1:B3" )

Then rng will be an environment with all the range functions. Try

> ls(rng)

to see what’s in there. Typically you’d use this to either get or set values in the range; for example,

> rng$put_Formula( "=exp(2)" );
[1] TRUE
> rng$get_Value()
     [,1]     [,2]
[1,] 7.389056 7.389056
[2,] 7.389056 7.389056
[3,] 7.389056 7.389056

You don’t have to store intermediate objects if you don’t need them; you can just chain functions together. However if you do that, tooltips won’t be available in the console, because the objects don’t exist yet.

> EXCEL$Application$get_Range( "C4" )$put_Value(100);

Getters and Setters

If you are familiar with VBA, syntax typically looks like

Application.Sheets(1).Name = "NewSheetName"

What’s actually happening here are a lot of property accesses. We can’t structure the syntax the same way in R, we have to be a little more explicit. All of the functions that get properties are named get_X(), and all of the functions that set properties are named put_X().

Also, in the above VBA expression, there’s an implicit index call into the Sheets object, which is a collection of worksheets. It’s really a property named Item. That needs to be made explicit. So the equivalent R expression would be

> EXCEL$Application$get_Sheets()$get_Item(1)$put_Name( "NewSheetName" );

Incidentally, the Excel scripting API is mostly – but not always – 1-based. (Not unlike R). So the above expression will set the name of the first sheet in the workbook.

Enums and Constants

A lot of Excel functions use constant values to set parameters. Often those parameters aren’t easy to guess. Calculation options, for example, use this enum:

typedef enum {
    xlCalculationAutomatic = 0xffffeff7,
    xlCalculationManual = 0xffffefd9,
    xlCalculationSemiautomatic = 2
} XlCalculation;

To make this a little easier we put all of these enums into the EXCEL environment, so you can use them by name:

> EXCEL$Application$put_Calculation( EXCEL$XlCalculation$xlCalculationManual )
[1] TRUE
> EXCEL$Application$put_Calculation( EXCEL$XlCalculation$xlCalculationAutomatic )
[1] TRUE
>

To see a list of enums, use ls( EXCEL ). For any enum in that list, you can use ls to see a list of values; for example,

> ls( EXCEL$XlCalculation )
[1] "xlCalculationAutomatic"     "xlCalculationManual"       
[3] "xlCalculationSemiautomatic"