| www.retrosoftware.co.uk http://www.retrosoftware.co.uk/forum/ |
|
| BASIC USR() v.s. ASM JSR http://www.retrosoftware.co.uk/forum/viewtopic.php?f=93&t=797 |
| Author: | pstnotpd [ Sun Oct 28, 2012 9:23 am ] | ||
| Post subject: | BASIC USR() v.s. ASM JSR | ||
Hi all, some help would be appreciated.

To get back into assembler I'm trying to code some routines to display hexagonal maps in mode 2. It's coming along nicely, but I can't figure out this behaviour.

The following routine (with some data parts left out) displays a hexagonal sprite at one of 150 screen positions (numbered 0 to 149), which are arranged in a hexagonal grid. Because it plots with EOR, calling it a second time for the same position neatly removes the hex again. This way it displays the map defined from line 1510 onwards.

To speed things up I want to move the map drawing to assembler as well. As a first test to see how fast that would be, I made the test loop at line 1320, which I presumed would behave in the same way. However, the sprite reference somehow seems to get mixed up in this case (see screenshot).

So can anybody tell me what the USR call at line 1480 is doing to make everything (seemingly) correct? I'm not using any stack now, so I presume I don't have to save registers as usual.

Code:
>LIST
10MODE2
20REMVDU28,16,31,19,0
30DIMMC% &500
40FORopt%=0TO2STEP2
50P%=MC%
60[
70OPT opt%
80.hexer ROL A:STA hexnr
90 TXA:CLC:ROL A:TAX:BCCloscr
100 \Carry is set, so use hscreen position
110.hiscr LDA hscreen,X:STA &81:INX:LDA hscreen,X:STA&80:JMPldspr
120.loscr LDA screen,X:STA &81:INX:LDA screen,X:STA&80
130.ldspr LDX hexnr:LDA spstrt,X:STA &82:INX:LDA spstrt,X:STA&83
140:
150.main \Loop for 6 rows
160 LDA#&06
170 STA &84
180.row \Loop for 6 blocks
190 LDX #&06
200.blk LDY #&00
210 LDA (&82),Y:EOR (&80),Y:STA (&80),Y:INY
220 LDA (&82),Y:EOR (&80),Y:STA (&80),Y:INY
230 LDA (&82),Y:EOR (&80),Y:STA (&80),Y:INY
240 LDA (&82),Y:EOR (&80),Y:STA (&80),Y
250 \Increase screen base address by 8 (skip 4 bytes)
260 CLC:LDA #&08:ADC &80:STA &80:BCC spradd
270 LDA #&00:ADC &81:STA &81
280 \Increase sprite base address by 4
290.spradd CLC:LDA #&04:ADC &82:STA &82:BCC nxtblk
300 LDA #&00:ADC &83:STA &83
310.nxtblk DEX:BNE blk
320 \Check last nibble of screen address
330 LDA &80:AND #&0F:CMP #&0C:BEQ fulrow
340 LDA &80:AND #&0F:CMP #&04:BEQ fulrow
350 \Nibble=0 or 8, so subtract &2C
360 SEC:LDA &80:SBC #&2C:STA &80:BCS nxtrow
370 LDA &81:SBC #&00:STA &81
380 JMP nxtrow
390.fulrow CLC:LDA #&4C:ADC &80:STA &80:LDA #&02:ADC &81:STA &81
400.nxtrow DEC &84:BNE row
410 RTS
420:
430.spstrt
440EQUW water:EQUW plains:EQUW hills:EQUW desert:EQUW city
450.hexnr EQUB &00 \Hex to be plotted
460:
470\6 rows of 6 blocks of 4 bytes top to bottom
480\block is 4 bytes bottom to top
490.plains
500EQUD&00000000:EQUD&0C040400:EQUD&0C0C0C00
510EQUD&0C0C0400:EQUD&0C080800:EQUD&00000000
520EQUD&04040000:EQUD&0C0C0C08:EQUD&0C0C0C0C
530EQUD&0C0C080C:EQUD&0C0C0C0C:EQUD&08080000
540EQUD&0C0C0C04:EQUD&0C0C0C04:EQUD&0C080C0C
550EQUD&0C0C0C0C:EQUD&0C0C080C:EQUD&0C0C0C08
560EQUD&040C0C0C:EQUD&0C0C080C:EQUD&0C0C0C0C
570EQUD&0C040C0C:EQUD&0C0C080C:EQUD&080C0C0C
580EQUD&00000404:EQUD&0C080C0C:EQUD&0C0C0C0C
590EQUD&0C0C0C0C:EQUD&0C0C0C0C:EQUD&00000808
600EQUD&00000000:EQUD&0004040C:EQUD&00040C0C
610EQUD&000C0C08:EQUD&0008080C:EQUD&00000000
620.water <SNIP>
750.hills <SNIP>
880.city <SNIP>
1010.desert <SNIP>
1140.screen \vertical rows of 10 hexes
1150EQUD&80370030:EQUD&8046003F:EQUD&8055004E:EQUD&8064005D:EQUD&8073006C
<SNIP>
1280.hscreen
1290EQUD&6075E06D
1300EQUD&0C3C8C34:EQUD&0C4B8C43:EQUD&0C5A8C52:EQUD&0C698C61:EQUD&0C788C70
1310EQUD&B0393032:EQUD&B0483041:EQUD&B0573050:EQUD&B066305F:EQUD&B075306E
1311:
1320.test LDA#100:STA&85
1330.dal LDA#1:LDX&85:JSRhexer:DEC&85:BNEdal:RTS
1340]
1350NEXTopt%
1360CLS
1370DIMMAP%(10,15)
1380FORY%=0TO9:FORX%=0TO14:READ MAP%(Y%,X%):NEXTX%:NEXTY%
1390FORY%=0TO9:FORX%=0TO14:I%=X%*10+Y%:PROChex(I%,MAP%(Y%,X%)):NEXTX%:NEXTY%
1400REMFORi%=0TO149:PROChex(i%,0):NEXTi%
1410REMCALL(test)
1420:
1430END
1440:
1450DEFPROChex(pos%,type%)
1460LOCALA%,X%,Y%,R%,C%
1470A%=type%:X%=pos%:Y%=0
1480R%=USR(hexer)
1490ENDPROC
1500:
1510DATA 0,0,0,0,1,1,1,1,2,2,2,2,1,1,1
1520DATA 0,0,0,0,0,1,1,2,2,2,3,3,1,2,3
1530DATA 0,0,0,0,0,0,1,1,2,2,2,3,1,2,3
1540DATA 1,0,0,0,0,0,0,1,1,2,2,2,1,2,3
1550DATA 1,0,0,0,0,0,0,1,1,4,4,2,2,2,2
1560DATA 1,1,1,0,0,0,0,0,1,1,4,4,1,2,2
1570DATA 2,1,1,1,1,0,0,0,1,1,4,1,1,1,2
1580DATA 2,2,1,1,1,1,1,0,1,1,1,0,0,0,2
1590DATA 3,2,2,2,1,1,1,1,0,1,1,0,0,2,2
1600DATA 3,3,2,2,2,1,2,1,0,1,0,1,0,0,2
>
|
|||
| Author: | pstnotpd [ Sun Oct 28, 2012 1:56 pm ] |
| Post subject: | Re: BASIC USR() v.s. ASM JSR |
Found the culprit! It always helps to formulate the question.

The first ROL A rolls "in" a set carry flag. Putting a CLC at the start fixed it.

However, if anyone has pointers on how to speed up the routine, let me know. Although displaying the map through machine code is much faster, I'm afraid it still won't make for a smooth refresh.

Now to move everything to Sideways Ram and make it an OSWORD..... |
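To make the failure mode concrete, here is a small Python model of that first instruction. It is only a sketch of the arithmetic, not 6502 code, and the names `rol_a` and `table_offset` are made up for illustration. It assumes the sprite type arrives in A and that the result is used as a byte offset into the two-byte entries at `.spstrt`. As far as I know, BBC BASIC's USR and CALL load the carry flag from bit 0 of C%, and PROChex declares C% LOCAL, which would presumably explain why the BASIC route entered with carry clear while the JSR loop inherited whatever carry the previous pass left behind.

```python
def rol_a(a, carry_in):
    """Model 6502 ROL A: shift left by one, the old carry enters bit 0."""
    result = ((a << 1) | carry_in) & 0xFF
    carry_out = (a >> 7) & 1
    return result, carry_out


def table_offset(sprite_type, carry_on_entry):
    """Offset into the EQUW table at .spstrt as computed by line 80."""
    offset, _ = rol_a(sprite_type, carry_on_entry)
    return offset


# Each .spstrt entry is two bytes, so a valid offset must be even.
for carry in (0, 1):
    off = table_offset(1, carry)
    print(f"carry={carry}: offset={off}, even={off % 2 == 0}")
```

With carry set on entry, the offset for type 1 is odd, so the address fetched straddles two table entries, which fits the mixed-up sprites in the screenshot.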
|
| Author: | RichTW [ Mon Oct 29, 2012 2:06 pm ] |
| Post subject: | Re: BASIC USR() v.s. ASM JSR |
Hi! Looking nice!

ASL A is a left-shift which shifts in a zero rather than the carry flag, so this is preferable to CLC:ROL A.

I can see a few additional improvements, but nothing major. Instead of:

Code:
250 \Increase screen base address by 8 (skip 4 bytes)
260 CLC:LDA #&08:ADC &80:STA &80:BCC spradd
270 LDA #&00:ADC &81:STA &81
280 \Increase sprite base address by 4
290.spradd

do:

Code:
250 \Increase screen base address by 8 (skip 4 bytes)
260 CLC:LDA #&08:ADC &80:STA &80:BCC spradd
270 INC &81
280 \Increase sprite base address by 4
290.spradd

(and similar for the block below)

These lines:

Code:
320 \Check last nibble of screen address
330 LDA &80:AND #&0F:CMP #&0C:BEQ fulrow
340 LDA &80:AND #&0F:CMP #&04:BEQ fulrow

can, I think, be more simply expressed as:

Code:
320 \Check last nibble of screen address
330 LDA &80:AND #&07:CMP #&04:BEQ fulrow

Then, at line 360, a similar optimisation to the first one:

Code:
360 SEC:LDA &80:SBC #&2C:STA &80:BCS nxtrow
370 LDA &81:SBC #&00:STA &81

becomes:

Code:
360 SEC:LDA &80:SBC #&2C:STA &80:BCS nxtrow
370 DEC &81

And finally, if it gets to fulrow, we know C is already set (from the CMP earlier), so we can save a CLC by changing:

Code:
390.fulrow CLC:LDA #&4C:ADC &80:STA &80:LDA #&02:ADC &81:STA &81

to:

Code:
390.fulrow LDA #&4B:ADC &80:STA &80:LDA #&02:ADC &81:STA &81

If you could find an efficient way to do so, i.e. with little overhead for the non-true case, it might be worth checking if you're plotting the top or bottom row, and if so only plotting the middle 4 byte columns (as the first and last will be blank in the case of a hexagonal sprite). This'd only be worthwhile though if you could incur little overhead in the other rows, otherwise the extra checks would negate any optimisation gained. |
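For anyone who, like the original poster, wants to write these rewrites down to see why they hold, the following Python sketch brute-forces them over every byte value. It models only binary-mode ADC and SBC; the helper names (`adc`, `sbc`, `check_all` and so on) are invented for the sketch and are not part of the thread's code. The carry trick relies on CMP setting the carry whenever A is greater than or equal to the operand, so carry is set on any taken BEQ.

```python
def adc(a, m, carry):
    """6502 ADC, binary mode: returns (8-bit result, carry out)."""
    total = a + m + carry
    return total & 0xFF, total >> 8


def sbc(a, m, carry):
    """6502 SBC, binary mode: A - M - (1 - C); carry out is 1 if no borrow."""
    total = a - m - (1 - carry)
    return total & 0xFF, 1 if total >= 0 else 0


def add8_original(lo, hi):
    lo, c = adc(lo, 0x08, 0)      # CLC:LDA #&08:ADC &80:STA &80
    if c:                         # BCC spradd not taken
        hi, _ = adc(0x00, hi, c)  # LDA #&00:ADC &81:STA &81
    return lo, hi


def add8_inc(lo, hi):
    lo, c = adc(lo, 0x08, 0)
    if c:
        hi = (hi + 1) & 0xFF      # INC &81
    return lo, hi


def sub2c_original(lo, hi):
    lo, c = sbc(lo, 0x2C, 1)      # SEC:LDA &80:SBC #&2C:STA &80
    if not c:                     # BCS nxtrow not taken
        hi, _ = sbc(hi, 0x00, c)  # LDA &81:SBC #&00:STA &81
    return lo, hi


def sub2c_dec(lo, hi):
    lo, c = sbc(lo, 0x2C, 1)
    if not c:
        hi = (hi - 1) & 0xFF      # DEC &81
    return lo, hi


def fulrow_two_compares(lo):
    return (lo & 0x0F) in (0x0C, 0x04)  # original lines 330 and 340


def fulrow_one_compare(lo):
    return (lo & 0x07) == 0x04          # suggested line 330


def check_all():
    """Compare each original sequence with its shorter replacement."""
    for lo in range(256):
        assert fulrow_two_compares(lo) == fulrow_one_compare(lo)
        # CLC + ADC #&4C versus ADC #&4B with carry already set by CMP
        assert adc(lo, 0x4C, 0) == adc(lo, 0x4B, 1)
        for hi in range(256):
            assert add8_original(lo, hi) == add8_inc(lo, hi)
            assert sub2c_original(lo, hi) == sub2c_dec(lo, hi)
    return True


print(check_all())
```

The mask works because &04 and &0C differ only in bit 3, which AND #&07 discards.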
|
| Author: | pstnotpd [ Mon Oct 29, 2012 2:48 pm ] |
| Post subject: | Re: BASIC USR() v.s. ASM JSR |
Thanks, I'm going to give those tips a go. Actually, I had already pondered leaving out the corner blocks. An added benefit is that the definitions would then align to pages (2 sprites per page). In that case the sprite base address calculation would not have to take the high byte into account, provided the sprite start address is correctly aligned. I'd have to count the cycles, though, to see if it will actually speed things up.

Of course the speed doesn't matter that much on a fixed map, but I'd like this to window a much larger map and do some kind of scrolling. If the hex plotting routine is moved to SWR and can be called through an OSWORD, the map can be whatever a 2nd processor can hold. |
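The "2 sprites per page" figure can be checked with a little arithmetic. The Python sketch below assumes the layout stated in the listing (6 rows of 6 blocks of 4 bytes) and assumes that "corner blocks" means the four 4-byte blocks at the corners; the function name `needs_high_byte_update` is hypothetical. It also checks when the pointer at &82 can be stepped through a sprite without the low byte wrapping.

```python
ROWS, BLOCKS_PER_ROW, BYTES_PER_BLOCK = 6, 6, 4
CORNER_BLOCKS = 4   # assumption: one blank block at each corner of the hexagon
PAGE = 256

full_size = ROWS * BLOCKS_PER_ROW * BYTES_PER_BLOCK          # current sprite size
trimmed_size = full_size - CORNER_BLOCKS * BYTES_PER_BLOCK   # size without corners
sprites_per_page = PAGE // trimmed_size


def needs_high_byte_update(start_lo, size, step=4):
    """True if stepping the low byte by `step` wraps before the last block.

    start_lo is the low byte of the sprite's start address. The pointer is
    reloaded for every plot, so only the blocks that are actually read matter.
    """
    last_block_lo = start_lo + size - step
    return last_block_lo > 0xFF


print(full_size, trimmed_size, sprites_per_page)
```

Under these assumptions a trimmed sprite starting at low byte &00 or &80 never needs the high-byte branch, whereas a full-size sprite packed directly after another one does.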
|
| Author: | RichTW [ Mon Oct 29, 2012 4:20 pm ] |
| Post subject: | Re: BASIC USR() v.s. ASM JSR |
Yep, that's a good idea, and if you align the sprite data to page boundaries, you won't incur the 1 cycle penalty in LDA(),Y for crossing a page boundary. I'd be inclined to handle the top row and bottom row as special cases outside of the main loop, maybe just as a subroutine which is called before and afterwards. |
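As a rough way to size that penalty, here is a Python sketch that counts the extra cycles for the sprite-data reads only. It assumes the usual NMOS 6502 timing for LDA (zp),Y of 5 cycles plus 1 when adding Y carries into the next page, and it assumes the plot loop from the listing (pointer stepped by 4, Y running 0 to 3). The function names are made up, and the screen-side EOR and STA are deliberately left out.

```python
def indirect_y_read_cycles(base, y):
    """Cycles for a read via (zp),Y: 5, plus 1 if base+Y crosses a page."""
    crosses = (base & 0xFF) + y > 0xFF
    return 5 + (1 if crosses else 0)


def sprite_read_penalty(start, size=144):
    """Extra cycles spent on page crossings while reading one sprite."""
    extra = 0
    for block in range(0, size, 4):   # pointer at &82/&83 advances by 4
        base = start + block
        for y in range(4):            # four LDA (&82),Y reads per block
            extra += indirect_y_read_cycles(base, y) - 5
    return extra


for start in (0x1000, 0x10FE):
    print(hex(start), sprite_read_penalty(start))
```

In this model any start address that is a multiple of 4 already avoids the penalty on the sprite reads, since the base low byte never exceeds &FC; page alignment additionally removes the high-byte update from the pointer arithmetic.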
|
| Author: | pstnotpd [ Tue Oct 30, 2012 7:54 am ] |
| Post subject: | Re: BASIC USR() v.s. ASM JSR |
Added your suggestions and they work fine. I like your take on the nibble check. Had to write it down to see how that worked. As expected it doesn't really have much visible impact on a complete screen refresh. I expect removing the corners won't do that either. First priority now is the SWR OSWORD code. Going through Bruce Smith's book. |
|
| Author: | jgharston [ Sun Nov 04, 2012 5:35 am ] |
| Post subject: | Re: BASIC USR() v.s. ASM JSR |
pstnotpd wrote: Now to move everything to Sideways Ram and make it an OSWORD.....

Make sure you use the correct layout for the control block: BeebWiki:OSWORD, and make sure you don't clash with an existing OSWORD: BeebWiki:OSWORDs. It's best practice to float the proposed OSWORD number and parameter block layout on the BBC Micro Mailing List before developing the code. |
|
| Author: | pstnotpd [ Sun Nov 04, 2012 8:55 am ] |
| Post subject: | Re: BASIC USR() v.s. ASM JSR |
jgharston wrote: It's best practice to float the proposed OSWORD number and parameter block layout on the BBC Micro Mailing List before developing the code.

I've been a "lurker" on the mailing list for quite a while now. For testing I'm going with OSWORD &65, as in Bruce Smith's book. I first want to get it working, and by working I mean it should be possible to show hexes from the ARM7 scheme interpreter.

For now I have managed to remove the 4*4 byte blocks at the corners, as RichTW suggested, so the hex definitions are now (and must now be) nicely page aligned per 2. And I'm pleasantly surprised that the code seems to work fine in all 20K modes (0-2). For the "wargame" purpose I have in mind it might be better to go with detailed mode 0 hexes.....

To be cont'd |
|
| All times are UTC [ DST ] |