The main thing is to calculate the screen address as infrequently as possible (once ever is a good start).

If you have 1 row = 1 page (256 bytes per character row), you don't have to worry about crossing a page boundary as you draw the chars/columns of your sprite, so you can just use ,X or ,Y indexed addressing.
I usually keep the fractional parts separate and the main parts in y (rows/pages), x (chars)
For pixel coordinates (a 256x256 variant of mode 4, so each character row is exactly one page):
addr = &4000 + (y/8)*256 + (x/8)*8 + y%8
in assembler, this simplifies to
addr_hi = &40 | (y >> 3)
addr_lo = (x & &F8) | (y & 7)
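As a quick sanity check on the simplification, here is a small Python sketch (run on a PC, not on the Beeb) that compares the arithmetic form with the shift/mask form for every pixel. The function names are made up for illustration, and the &4000 base is simply the one used in the formula above.

```python
SCREEN_BASE = 0x4000  # &4000, the base used in the formula above


def addr_arith(x, y):
    """Arithmetic form: one 256-byte page per character row."""
    return SCREEN_BASE + (y // 8) * 256 + (x // 8) * 8 + y % 8


def addr_bits(x, y):
    """Shift/mask form, built as separate high and low bytes."""
    hi = 0x40 | (y >> 3)
    lo = (x & 0xF8) | (y & 7)
    return (hi << 8) | lo


# Compare the two forms for every pixel of a 256x256 screen.
mismatches = [(x, y)
              for y in range(256)
              for x in range(256)
              if addr_arith(x, y) != addr_bits(x, y)]
print(len(mismatches))  # prints 0
```

The two forms agree because (y/8)*256 only ever touches the high byte and the other two terms never overlap in the low byte, so the adds can all become ORs.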
in asm (not tested, just typed in here):
tya : lsr A : lsr A : lsr A : ora #&40 : sta &71
txa : and #&F8 : sta &70
tya : and #&07 : ora &70 : sta &70
NB this only trashes A, not X or Y, and doesn't depend on the carry going in (no ADC, so no CLC needed), although the LSRs do change the carry on the way out - think of it as "premature optimization" or "saving yourself work later"

I wouldn't use the third line, as I would probably be using the vertical offset within a char to pick a custom drawing routine or to offset the src/dst.
I probably wouldn't store the second line in &70 as I would keep that as 0 and just put the char offset in Y.
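You could save a byte (same number of clocks, since SEC costs the 2 cycles that ORA # did) by shifting the &40 bit in through the carry, so the ora #&40 isn't needed. Change the first line to something like
tya : lsr A : SEC : ror A : lsr A : sta &71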
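Since none of this was tested on real hardware, here is a rough Python model (again for a PC) of the two ways of building the high byte. It only models the 8-bit accumulator and the carry for LSR A and ROR A. The helper names are invented for this sketch. It checks that the lsr / SEC / ror / lsr shift sequence on its own already produces &40 | (y >> 3).

```python
def lsr(a):
    """LSR A: shift right, 0 enters bit 7, bit 0 falls into carry."""
    return a >> 1, a & 1


def ror(a, carry):
    """ROR A: shift right, old carry enters bit 7, bit 0 falls into carry."""
    return (a >> 1) | (carry << 7), a & 1


def hi_plain(y):
    """tya : lsr A : lsr A : lsr A : ora #&40"""
    a = y
    for _ in range(3):
        a, _carry = lsr(a)
    return a | 0x40


def hi_sec_ror(y):
    """tya : lsr A : SEC : ror A : lsr A"""
    a, _carry = lsr(y)
    a, _carry = ror(a, 1)  # SEC sets carry, ROR moves it into bit 7
    a, _carry = lsr(a)     # ...and this LSR moves it down to bit 6 (&40)
    return a


all_match = all(hi_plain(y) == hi_sec_ror(y) == (0x40 | (y >> 3))
                for y in range(256))
print(all_match)  # prints True
```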
PS: changing mode yourself is quite easy, and can be handy if you need to do some relocation and don't want 10K cleared when you are only using 8K and aren't sure which bytes would get overwritten.