You can talk to Excel from R using the Excel scripting interface. This is the same interface that VBA uses, we just wrapped it up so you can call it from R code.
The main Application object is installed as EXCEL$Application
. The dollar sign indicates that the object is called Application
, and it’s in an environment called EXCEL
. Most of the things you do with Excel will start with this object. To get a quick look at the methods, try
> ls( EXCEL$Application )
Everything that’s an object for purposes of COM is set in R as an environment (with some S3 class names added for identification). That’s why all the syntax uses dollar signs (indexing into environments). To get a range, for example, you could say
> rng <- EXCEL$Application$get_Range( "A1:B3" )
Then rng
will be an environment with all the range functions. Try
> ls(rng)
to see what’s in there. Typically you’d use this to either get or set values in the range; for example,
> rng$put_Formula( "=exp(2)" );
[1] TRUE
> rng$get_Value()
[,1] [,2]
[1,] 7.389056 7.389056
[2,] 7.389056 7.389056
[3,] 7.389056 7.389056
You don’t have to store intermediate objects if you don’t need them; you can just chain functions together. However if you do that, tooltips won’t be available in the console, because the objects don’t exist yet.
> EXCEL$Application$get_Range( "C4" )$put_Value(100);
If you are familiar with VBA, syntax typically looks like
Application.Sheets(1).Name = "NewSheetName"
What’s actually happening here are a lot of property accesses. We can’t structure the syntax the same way in R, we have to be a little more explicit. All of the functions that get properties are named get_X()
, and all of the functions that set properties are named put_X()
.
Also, in the above VBA expression, there’s an implicit index call into the Sheets
object, which is a collection of worksheets. It’s really a property named Item
. That needs to be made explicit. So the equivalent R expression would be
> EXCEL$Application$get_Sheets()$get_Item(1)$put_Name( "NewSheetName" );
Incidentally, the Excel scripting API is mostly – but not always – 1-based. (Not unlike R). So the above expression will set the name of the first sheet in the workbook.
A lot of Excel functions use constant values to set parameters. Often those parameters aren’t easy to guess. Calculation options, for example, use this enum:
typedef enum {
xlCalculationAutomatic = 0xffffeff7,
xlCalculationManual = 0xffffefd9,
xlCalculationSemiautomatic = 2
} XlCalculation;
To make this a little easier we put all of these enums into the EXCEL
environment, so you can use them by name:
> EXCEL$Application$put_Calculation( EXCEL$XlCalculation$xlCalculationManual )
[1] TRUE
> EXCEL$Application$put_Calculation( EXCEL$XlCalculation$xlCalculationAutomatic )
[1] TRUE
>
To see a list of enums, use ls( EXCEL )
. For any enum in that list, you can use ls
to see a list of values; for example,
> ls( EXCEL$XlCalculation )
[1] "xlCalculationAutomatic" "xlCalculationManual"
[3] "xlCalculationSemiautomatic"